ENGINEERING PLANT THIOESTERASES AND DISCLOSURE OF
PLANT THIOESTERASES HAVING NOVEL SUBSTRATE
SPECIFICITY

Technical Field 

The present invention is directed to proteins, nucleic 
acid sequences and constructs, and methods related thereto. 

INTRODUCTION

Background 

Fatty acids are organic acids having a hydrocarbon
chain of from about 4 to 24 carbons. Many different kinds
of fatty acids are known which differ from each other in
chain length, and in the presence, number and position of
double bonds. In cells, fatty acids typically exist in
covalently bound forms, the carboxyl portion being referred
to as a fatty acyl group. The chain length and degree of
saturation of these molecules are often depicted by the
formula CX:Y, where "X" indicates the number of carbons and
"Y" indicates the number of double bonds.
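
As an illustration of the CX:Y shorthand, the following minimal
Python sketch parses labels such as "C18:1" or "12:0" into a carbon
count and a double-bond count. The names AcylGroup, parse_acyl and
is_saturated are hypothetical helpers introduced only for this
example; they are not part of the disclosure.

```python
import re
from typing import NamedTuple


class AcylGroup(NamedTuple):
    """Hypothetical container for one CX:Y label."""
    carbons: int        # "X" in CX:Y
    double_bonds: int   # "Y" in CX:Y


def parse_acyl(label: str) -> AcylGroup:
    """Parse shorthand such as 'C18:1' or '12:0' (the leading 'C' is optional)."""
    match = re.fullmatch(r"C?(\d+):(\d+)", label.strip())
    if match is None:
        raise ValueError(f"not a CX:Y label: {label!r}")
    return AcylGroup(int(match.group(1)), int(match.group(2)))


def is_saturated(label: str) -> bool:
    """A fatty acyl group is saturated when it has no double bonds."""
    return parse_acyl(label).double_bonds == 0
```

For example, parse_acyl("C18:1") describes an oleoyl group (18
carbons, one double bond), and is_saturated("C16:0") is true for a
palmitoyl group.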

The production of fatty acids in plants begins in the
plastid with the reaction between acetyl-CoA and malonyl-ACP
to produce butyryl-ACP, catalyzed by the enzyme
β-ketoacyl-ACP synthase III. Elongation of acetyl-ACP to
16- and 18-carbon fatty acids involves the cyclical action
of the following sequence of reactions: condensation with a
two-carbon unit from malonyl-ACP to form a β-ketoacyl-ACP
(β-ketoacyl-ACP synthase), reduction of the keto function
to an alcohol (β-ketoacyl-ACP reductase), dehydration to
form an enoyl-ACP (β-hydroxyacyl-ACP dehydrase), and finally
reduction of the enoyl-ACP to form the elongated saturated
acyl-ACP (enoyl-ACP reductase). β-ketoacyl-ACP synthase I
catalyzes elongation up to palmitoyl-ACP (C16:0), whereas
β-ketoacyl-ACP synthase II catalyzes the final elongation to
stearoyl-ACP (C18:0). The longest chain fatty acids produced
by the fatty acid synthase (FAS) are typically 18 carbons
long. A further fatty acid biochemical step occurring in
the plastid is the desaturation of stearoyl-ACP (C18:0) to
form oleoyl-ACP (C18:1) in a reaction catalyzed by a Δ-9
desaturase, also often referred to as a "stearoyl-ACP
desaturase" because of its high activity toward stearate,
the 18-carbon acyl-ACP.
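
The elongation cycle described above can be summarized in a short
Python sketch that lists, for a saturated acyl-ACP of a given
length, which condensing enzyme acts at each two-carbon step. The
labels "KAS I", "KAS II" and "KAS III" are shorthand assumed here
for β-ketoacyl-ACP synthase I, II and III, and elongation_steps is
a hypothetical helper; the reduction, dehydration and second
reduction that complete each cycle are not modeled.

```python
def elongation_steps(target_carbons: int = 18):
    """Return (synthase, carbons_before, carbons_after) for each condensation.

    Assumes the scheme in the text: synthase III forms the 4-carbon
    butyryl-ACP, synthase I elongates up to C16:0, and synthase II
    performs the final step to C18:0.
    """
    if target_carbons % 2 != 0 or not 4 <= target_carbons <= 18:
        raise ValueError("target must be an even chain length from 4 to 18")
    steps = [("KAS III", 2, 4)]  # acetyl-CoA + malonyl-ACP -> butyryl-ACP
    carbons = 4
    while carbons < target_carbons:
        synthase = "KAS I" if carbons < 16 else "KAS II"
        steps.append((synthase, carbons, carbons + 2))
        carbons += 2
    return steps
```

Under these assumptions, stearoyl-ACP (C18:0) requires eight
condensations in total, only the last of which is assigned to
synthase II.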
Carbon-chain elongation in the plastids can be
terminated by transfer of the acyl group to glycerol
3-phosphate, with the resulting glycerolipid retained in the
plastidial, or "prokaryotic", lipid biosynthesis pathway.
Alternatively, specific thioesterases can intercept the
prokaryotic pathway by hydrolyzing the newly produced
acyl-ACPs into free fatty acids and ACP.

Subsequently, the free fatty acids are converted to
fatty acyl-CoAs in the plastid envelope and exported to
the cytoplasm. There, they are incorporated into the
"eukaryotic" lipid biosynthesis pathway in the endoplasmic
reticulum, which is responsible for the formation of
phospholipids, triglycerides and other neutral lipids.
Following transport of fatty acyl-CoAs to the endoplasmic
reticulum, subsequent sequential steps for triglyceride
production can occur. For example, polyunsaturated fatty
acyl groups, such as linoleoyl and α-linolenoyl, are
produced as the result of sequential desaturation of oleoyl
acyl groups by the action of membrane-bound enzymes.
Triglycerides are formed by action of the 1-, 2-, and
3-acyl-ACP transferase enzymes, glycerol-3-phosphate
acyltransferase, lysophosphatidic acid acyltransferase and
diacylglycerol acyltransferase. The fatty acid composition
of a plant cell is a reflection of the free fatty acid pool
and the fatty acids (fatty acyl groups) incorporated into
triglycerides as a result of the acyltransferase
activities. The properties of a given triglyceride will
depend upon the various combinations of fatty acyl groups
in the different positions in the triglyceride molecule.
For example, if the fatty acyl groups are mostly saturated
fatty acids, then the triglyceride will be solid at room
temperature. In general, however, vegetable oils tend to
be mixtures of different triglycerides. The triglyceride
oil properties are therefore a result of the combination of
triglycerides which make up the oil, which are in turn
influenced by their respective fatty acyl compositions.

Plant acyl-acyl carrier protein thioesterases are of
biochemical interest because of their roles in fatty acid
synthesis and their utilities in bioengineering of plant
oil seeds. A medium-chain acyl-ACP thioesterase from
California bay tree, Umbellularia californica, has been
isolated (Davies et al. (1991) Arch. Biochem. Biophys.
290:37-45), and its cDNA cloned and expressed in E. coli
(Voelker et al. (1994) J. Bacteriol. 176:7320-7327) and
seeds of Arabidopsis thaliana and Brassica napus (Voelker
et al. (1992) Science 257:72-74). In all cases, large
amounts of laurate (12:0) and small amounts of myristate
(14:0) were accumulated. These results demonstrated the
role of the thioesterase (TE) in determining chain length
during de novo fatty acid biosynthesis in plants and the
utility of these enzymes for modifying seed oil
compositions in higher plants.

Recently, a number of cDNAs encoding different plant
acyl-ACP thioesterases have been cloned (Knutzon et al.
(1992) Plant Physiol. 100:1751-1758; Voelker et al. (1992)
supra; Dormann et al. (1993) Planta 159:425-432; Dormann et
al. (1994) Biochim. Biophys. Acta 1212:134-136; Jones et
al. (1995) The Plant Cell 7:359-371). Sequence analyses of
these thioesterases show high homology, implying similarity
in structure and function. Some of these thioesterase cDNAs
have been expressed in E. coli, and their substrate
specificities determined by in vitro assays. The fact that
these enzymes share significant sequence homology, yet show
different substrate specificities, indicates that subtle
changes of amino acids may be sufficient to change
substrate selectivity.

Little information is available on structural and
functional divergence amongst these plant thioesterases,
and the tertiary structure of any plant thioesterase has
yet to be determined. Protein engineering may prove to be a
powerful tool for understanding the mechanism of
thioesterase substrate recognition and catalysis, and thus
lead to the rational design of new enzymes with desirable
substrate specificities. Such new enzymes would find use
in plant bioengineering to provide various modifications of
fatty acyl compositions, particularly with respect to
production of vegetable oils having significant proportions
of desired fatty acyl groups, including medium-chain fatty
acyl groups (C8 to C14) and longer chain fatty acyl groups
(C16 or C18). In addition, it is desirable to control the
relative proportions of various fatty acyl groups that are
present in the seed storage oil to provide a variety of
oils for a wide range of applications.

Literature 

The strategy of using chimeric gene products has been
applied to study the structure and function of
phosphotransferases in yeast (Hjelmstad et al. (1994) J.
Biol. Chem. 269:20995-21002) and restriction endonucleases
of Flavobacterium (Kim et al. (1994) Proc. Natl. Acad. Sci.
USA 91:883-887).

Domain swapping to rearrange functional domains of 
proteins has been used in protein engineering (Hedstrom 
(1994) Current Opinion in Structural Biology 4:608-611).

Recently the structure of a myristoyl-ACP thioesterase
from Vibrio harveyi has been determined (Lawson et al.
(1994) Biochemistry 33:9382-9388). This thioesterase, like
other bacterial or mammalian thioesterases, shares no
sequence homology with plant thioesterases (Voelker et al.
(1992) supra).

DESCRIPTION OF THE FIGURES 

Figure 1. An amino acid sequence alignment of
representative Class I (FatA) and Class II (FatB)
thioesterases is provided. UcFatB1 (SEQ ID NO:1) is a
California bay C12 thioesterase. CcFatB1 (SEQ ID NO:2) is
a camphor C14 thioesterase. CpFatB1 (SEQ ID NO:3) is a
Cuphea palustris C8 and C10 thioesterase. CpFatB2 (SEQ ID
NO:4) is a Cuphea palustris C14 thioesterase. GarmFatA1
(SEQ ID NO:5) is a mangosteen 18:1 thioesterase which also
has considerable activity on C18:0 acyl-ACP substrates.

SUBSTITUTE SHEET (RULE 26) 



WO 96/36719 PCT/US96/07064

BrFatA1 (SEQ ID NO:6) is an 18:1 thioesterase from Brassica
rapa (aka Brassica campestris). Amino acid sequences which
are identical in all of the represented thioesterases are
indicated by bold shading.
Figure 2. Results of thioesterase activity assays of
wild-type bay (Figure 2A) and wild-type camphor (Figure 2B)
thioesterases upon expression in E. coli are presented.

Figure 3. Nucleic acid and translated amino acid
sequence of a PCR fragment (SEQ ID NO:7) containing the
encoding region for the mature protein portion of a camphor
Class II acyl-ACP thioesterase is provided.

Figure 4. Nucleic acid and translated amino acid
sequence (SEQ ID NO:8) of a mangosteen Class I acyl-ACP
thioesterase clone (GarmFatA1) is provided. GarmFatA1
demonstrates primary thioesterase activity on 18:1 acyl-ACP
substrate, but also demonstrates considerable activity on
18:0 substrate (approximately 10-20% of 18:1 activity).

Figure 5. Nucleic acid and translated amino acid
sequence (SEQ ID NO:9) of a mangosteen Class I acyl-ACP
thioesterase clone, GarmFatA2, is provided. GarmFatA2 has
thioesterase activity primarily on 18:1 acyl-ACP substrate,
and equally low activity on 16:0 and 18:0 substrates.

Figure 6. Nucleic acid and translated amino acid
sequence (SEQ ID NO:10) of a Cuphea palustris Class II
acyl-ACP thioesterase clone (CpFatB1) having preferential
activity on C8 and C10 acyl-ACP substrates is provided.

Figure 7. Nucleic acid and translated amino acid
sequence (SEQ ID NO:11) of a Cuphea palustris Class II
acyl-ACP thioesterase clone (CpFatB2) having preferential
activity on C14 acyl-ACP substrates is provided.

Figure 8. An amino acid sequence comparison of bay
(C12) (SEQ ID NO:1) and camphor (C14) (SEQ ID NO:2)
acyl-ACP thioesterases is provided. Amino acid residues which
differ between the thioesterases are indicated by bold
shading.

Figure 9. Bay/camphor chimeric constructs, Ch-1 and
Ch-2, are shown as in-frame fusions of N- and C-terminal
portions of the thioesterases (from left to right). The
KpnI site used in constructing the chimeric constructs is
shown.

Figure 10. An amino acid sequence comparison of C.
palustris CpFatB1 (C8/C10) (SEQ ID NO:3) and C. palustris
CpFatB2 (C14) (SEQ ID NO:4) acyl-ACP thioesterases is
provided. Amino acid residues which differ between the
thioesterases are indicated by bold shading.

Figure 11. Substrate specificities of the bay/camphor
chimeric enzymes and two bay mutant thioesterases are
provided (dark shaded columns). Control (E. coli
transformed with vector alone) background activities are
indicated by the light hatched columns. (A) Ch-1, (B) Ch-2,
(C) bay mutant M197R/R199H, and (D) bay mutant
M197R/R199H/T231K.

Figure 12. Relative thioesterase activity of wild-type
(5247) and mutant Garcinia mangifera thioesterases
(GarmFatA1) on 18:1, 18:0 and 16:0 acyl-ACP substrates is
provided.

Figure 13. An amino acid sequence comparison of B.
rapa BrFatA1 (C18:1) (SEQ ID NO:6) and Garcinia mangifera
GarmFatA1 (C18:1/C18:0) (SEQ ID NO:5) acyl-ACP
thioesterases is provided. Amino acid residues which
differ between the thioesterases are indicated by bold
shading.

Figure 14. Short domain-swapping by PCR. The
full-length gene is shown by two long, parallel lines. The
hatched area represents the domain of interest. For each
PCR primer (a, b, c, and d), an arrow-head points to
the 3' end. Primers a and b are forward and reverse primers
for the full-length DNA. The thin lines in primers c and d
represent sequences that exactly match 3' down-stream of
the domain. The thick tails of primers c and d are the 5'
overhangs corresponding to the new domain sequence.

Figure 15. Long domain-swapping by PCR. Two PCRs (PCR
1 and 2) are carried out with gene I as template. A third
PCR is performed simultaneously with gene II as template.
Primers a and b are forward and reverse primers for the
full-length gene I. Primer c matches the sequence
immediately 3' down-stream of the original domain in gene
I. The thin line in primer d represents sequence that
matches 3' down-stream of the original domain in gene I,
whereas the thick tail matches the 3' end sequence of the
replacement domain in gene II. Primer e primes the 5' end
of the domain in gene II, while f primes the other end. The
thin tail in primer f represents sequence that matches 3'
down-stream of the original domain in gene I.

SUMMARY OF THE INVENTION

By this invention, methods of producing engineered
plant acyl-ACP thioesterases are provided, wherein said
engineered plant acyl-ACP thioesterases demonstrate altered
substrate specificity with respect to the acyl-ACP
substrates hydrolyzed by the plant thioesterases as
compared to the native acyl-ACP thioesterase. Such methods
comprise the steps of (1) modifying a gene sequence
encoding a plant thioesterase protein targeted for
modification to produce one or more modified thioesterase
gene sequences, wherein the modified sequences encode
engineered acyl-ACP thioesterases having substitutions,
insertions or deletions of one or more amino acid residues
in the mature portion of the target plant thioesterase, (2)
expressing the modified encoding sequences in a host cell,
whereby engineered plant thioesterases are produced, and
(3) assaying the engineered plant thioesterases to detect
those having desirable alterations in substrate
specificity.

Of particular interest for amino acid alterations is
the C-terminal two thirds portion of plant thioesterase,
and more particularly, the region corresponding to amino
acids 229 to 285 (consensus numbering above sequences) of
plant thioesterase sequences as represented in the sequence
alignment of Figure 1. Additionally, the region of from
amino acid 285-312 is of interest for modification of
thioesterase substrate specificity towards shorter chain
fatty acids such as C8 and C10.

Useful information regarding potential modification
sites in a targeted thioesterase may be obtained by
comparison of related plant acyl-ACP thioesterase amino
acid sequences, wherein the compared thioesterases
demonstrate different hydrolysis activities. Comparisons
of plant thioesterase amino acid sequences having at least
75% sequence identity in the mature protein region are
particularly useful in this regard. In this manner, amino
acid residues or peptide domains which are different in the
related thioesterases may be selected for mutagenesis.
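
The comparison described above may be sketched in Python as
follows. Given two pre-aligned amino acid sequences, the function
reports percent identity over the ungapped columns and the columns
at which the residues differ, which are the candidate positions for
mutagenesis. The function name compare_aligned and the short
sequences in the usage note are hypothetical; they are not the
thioesterase sequences of the Figures.

```python
def compare_aligned(seq_a: str, seq_b: str, gap: str = "-"):
    """Return (percent identity, [(column, residue_a, residue_b), ...]).

    Both inputs must already be aligned to the same length. Columns in
    which either sequence has a gap are skipped. Columns are 1-based.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    differences = []
    for column, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
        else:
            differences.append((column, a, b))
    identity = 100.0 * identical / compared if compared else 0.0
    return identity, differences
```

With the toy inputs "GMRRDWLV" and "GRRHDWLV", six of eight columns
match (75% identity) and columns 2 and 4 are reported as differing.
A pair of sequences meeting the 75% threshold in the mature protein
region would be screened in the same way.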

Other methods for selecting amino acids or peptide
domains for modification include analysis of thioesterase
protein sequences for predicted effects of substitutions,
insertions or deletions on flexibility and/or secondary
structure of the target thioesterase.

In addition, useful thioesterase gene mutations may be
discovered by random mutation of plant acyl-ACP
thioesterase encoding sequences, followed by analysis of
thioesterase activity or fatty acid composition to detect
alterations in substrate specificity.

To produce an engineered thioesterase, a DNA sequence
encoding the thioesterase may be altered by domain swapping
or mutagenesis, either random or site-directed, to
introduce amino acid substitutions, insertions or
deletions. The DNA sequences may then be expressed in host
cells for production of engineered thioesterases and for
analysis of resulting fatty acid compositions. Engineered
thioesterases produced in this manner are also assayed to
determine effects of the amino acid sequence modifications
on the substrate specificity of the thioesterase. In this
manner, novel thioesterases may be discovered which
demonstrate a variety of profiles with respect to the
carbon chain lengths of the acyl-ACP substrates which may
be hydrolyzed or with respect to the relative activity of
the thioesterase on different carbon chain length acyl-ACP
substrates.

Thus, DNA sequences and constructs for expression of
engineered thioesterases, as well as the novel
thioesterases produced therefrom, are also considered within
the scope of the invention described herein. Such DNA
sequences may be used for expression of the engineered
thioesterases in host cells for the modification of fatty
acid composition. Of particular interest in the instant
invention are DNA constructs for expression of engineered
thioesterases in plant cells, especially in plant seed
cells of oilseed crop plants. As the result of expression
of such constructs, plant triglyceride oil may be produced,
wherein the composition of the oil reflects the altered
substrate specificity of the engineered thioesterases.
Thus, plant cells, seeds and plants comprising the
constructs provided herein are all encompassed by the
instant invention, as well as novel plant oils that may be
harvested from the plant seeds.

For example, a C12 preferring plant acyl-ACP
thioesterase described herein may be altered to obtain a
plant thioesterase having approximately equal activity on
C14 and C12 substrates. Further modification of the C12
enzyme yields a thioesterase having greater activity on C14
as compared to C12 substrates.

Also provided in the instant invention are novel plant
acyl-ACP thioesterase sequences from Cuphea palustris and
mangosteen (Garcinia mangifera). The C. palustris
sequence, CpFatB1, demonstrates substrate specificity
towards C8 and C10 fatty acyl-ACPs with higher activity on
C8. A mangosteen thioesterase gene, GarmFatA1,
demonstrates primary activity on 18:1-ACP substrates, but
also demonstrates substantial activity on 18:0-ACP.
Importantly, this clone does not demonstrate specificity
for 16:0 substrates. Methods of modifying the specificity
of these novel C8/C10 and C18:1/C18:0 plant thioesterases
are also provided in the instant invention. In particular,
mutations which increase the 18:0/18:1 activity ratio of
the mangosteen clone are provided.

DETAILED DESCRIPTION OF THE INVENTION

By this invention methods to produce engineered plant
thioesterases having altered substrate specificity are
provided. An engineered plant thioesterase of this
invention may include any sequence of amino acids, such as
a protein, polypeptide or peptide fragment obtainable from
a plant source which demonstrates the ability to catalyze
the production of free fatty acid(s) from fatty acyl-ACP
substrates under plant enzyme reactive conditions. By
"enzyme reactive conditions" is meant that any necessary
conditions are available in an environment (i.e., such
factors as temperature, pH, lack of inhibiting substances)
which will permit the enzyme to function.

Engineered plant thioesterases may be prepared by
random or specific mutagenesis of a thioesterase encoding
sequence to provide for one or more amino acid
substitutions in the translated amino acid sequence.
Alternatively, an engineered plant thioesterase may be
prepared by domain swapping between related plant
thioesterases, wherein extensive regions of the native
thioesterase encoding sequence are replaced with the
corresponding region from a different plant thioesterase.

Targets for domain swapping can include peptides
ranging from five or six to tens of amino acids in length.
In an ideal case, this type of interchange can be
accomplished by the presence of unique, conserved
restriction sites at the exact points of exchange in the
genes encoding both proteins. Oligo-based mutagenesis
(looping) may be applied when convenient restriction sites
are not available, although this process may be
time-consuming when large domain sequences are to be swapped.
Alternatively, as described in the following Examples, a
rapid method for domain swapping may be employed which is a
modification of an overlap extension technique using
polymerase chain reaction (PCR) described by Horton et al.
(BioTechniques (1990) 8:528-535). The entire procedure can
be done within six hours (time for two PCR runs) without in
vivo manipulation. The basis for the overlap extension
method is that in a PCR the primers must match their
template sequence well enough to prime, but they need not
match exactly, especially toward the 5' end. In fact, PCR
primers with 5' overhangs (non-matching sequences) are
routinely used. The PCR-based domain swapping is designed
for applications where the domain contains about six amino
acids or less (short domain swapping), or where domains
containing much larger numbers of amino acids are to be
swapped (long domain swapping).

Altered substrate specificities of an engineered
thioesterase may be reflected by the presence of hydrolysis
activity on an acyl-ACP substrate of a particular chain
length which is not hydrolysed by the native thioesterase
enzyme. The newly recognized acyl-ACP substrate may differ
from native substrates of the enzyme in various ways, such
as by having a shorter or longer carbon chain length
(usually reflected by the addition or deletion of one or
more 2-carbon units), by having a greater or lesser degree
of saturation, or by the presence of a methyl group, such
as in certain fatty acids which are not commonly present in
plant cells, i.e. iso- and anti-iso- fatty acids.
Alternatively, altered substrate specificity may be
reflected by a modification of the relative hydrolysis
activities on two or more acyl-ACP substrates of differing
chain length and/or degree of saturation.

DNA and amino acid sequence information for more than
thirty plant acyl-ACP thioesterases is now available, and
these sequences may be used in the methods of the instant
invention to identify desirable regions for modification to
produce sequences for expression of engineered
thioesterases.

Plant thioesterases can be classified into two classes
by sequence homology. All of these plant thioesterases
contain a transit peptide, of 60 to 80 amino acids in
length, for plastid targeting. The transit peptides bear
little homology between species while the mature protein
regions (minus transit peptide) show significant amino acid
sequence identity.

The first class, Class I (or FatA), includes long chain
acyl-ACP thioesterases having activity primarily on
18:1-ACP. 18:1-ACP is the immediate precursor of most fatty
acids found in phospholipids and triglycerides synthesized
by the eukaryotic pathway. This class of thioesterase has
been found in essentially all plant sources examined to
date, and is suggested to be an essential "housekeeping"
enzyme (Jones et al. (supra)) required for membrane
biosynthesis. Examples of Class I thioesterases from
safflower, Cuphea hookeriana and Brassica rapa
(campestris), which have activity primarily on 18:1-ACP
substrate, have been described (WO 92/20236 and WO
94/10288). Other 18:1 thioesterases have been reported in
Arabidopsis thaliana (Dormann et al. (1995) Arch. Biochem.
Biophys. 316:612-618), Brassica napus (Loader et al. (1993)
Plant Mol. Biol. 23:769-778) and coriander (Dormann et al.
(1994) Biochim. Biophys. Acta 1212:134-136). A similar
18:1-ACP specific Class I thioesterase (GarmFatA2) has been
discovered in developing embryos from mangosteen (Garcinia
mangifera), and is described herein. A Class I
thioesterase from soybean (WO 92/11373) was reported to
provide 10- and 96-fold increases in 16:0-ACP and 18:1-ACP
activity upon expression in E. coli, and a smaller (3-4
fold) increase in 18:0-ACP activity. The mature protein
regions of Class I plant thioesterases are highly
homologous, demonstrating greater than 80% sequence
identity.

In addition, another mangosteen Class I thioesterase
(GarmFatA1), also described herein, has been discovered
which demonstrates thioesterase activity primarily on
18:1-ACP substrates (100-fold increase upon expression in E.
coli), but also demonstrates selective activity on 18:0-ACP
versus 16:0-ACP. The 18:0 activity of GarmFatA1 is
approximately 25% of the 18:1 activity, whereas in most
Class I thioesterases analyzed to date, the 18:1 activity
is highly predominant, with activity on 16:0 and 18:0
substrates detectable at less than 5% of the 18:1 activity
levels.

A second class of plant thioesterases, Class II (or
FatB) thioesterases, includes enzymes that utilize fatty
acids with shorter chain lengths, from C8:0 to C14:0
(medium chain fatty acids), as well as C16:0. Class II
thioesterases preferably catalyze the hydrolysis of
substrates containing saturated fatty acids. Class II (or
FatB) thioesterases have been isolated from California bay,
elm, Cuphea hookeriana, Cuphea palustris, Cuphea
lanceolata, nutmeg, Arabidopsis thaliana, mango, leek and
camphor. The mature protein regions of Class II plant
thioesterases are also highly homologous, demonstrating
70-80% sequence identity.

One of the characteristics of Class II thioesterases
is the presence of a relatively hydrophobic region of
approximately 40 amino acids in the N-terminal region of
the mature proteins. This hydrophobic region is not found
in 18:1-ACP thioesterases, and has no apparent effect on
the enzyme activity. Recombinant expression of a bay Class
II thioesterase with or without this region showed
identical activity profiles in vitro (Jones et al.
(supra)).

As demonstrated more fully in the following examples,
the acyl-ACP substrate specificity of plant thioesterases
may be modified by various amino acid changes to the
protein sequence, such as amino acid substitutions,
insertions or deletions in the mature protein portion of
the plant thioesterases. Modified substrate specificity
can be detected by expression of the engineered plant
thioesterases in E. coli and assaying to detect enzyme
activity.

Modified substrate specificity may be indicated by a
shift in acyl-ACP substrate preference such that the
engineered thioesterase is newly capable of hydrolysing a
substrate not recognized by the native thioesterase. The
newly recognized substrate may vary from substrates of the
native enzyme by carbon chain length and/or degree of
saturation of the fatty acyl portion of the substrate.
Alternatively, modified substrate specificity may be
reflected by a shift in the relative thioesterase activity
on two or more substrates of the native thioesterase such
that an engineered thioesterase exhibits a different order
of preference for the acyl-ACP substrates.
For example, a plant thioesterase having primary
hydrolysis activity on C12 substrate and some minor
activity on C14 substrate may be modified to produce an
engineered thioesterase which exhibits increased activity
on C14, for example so that the engineered thioesterase has
approximately equal activity on C12 and C14 substrates.
Similarly, such plant C12 thioesterases may be further
modified to produce an engineered thioesterase having
primary activity on C14 substrates and little or no
activity on C12 substrates. Alternatively, a plant
thioesterase may be modified so as to alter the relative
activity towards a substrate having a higher or lesser
degree of saturation. For example, a Class I (18:1)
thioesterase may be modified to increase the relative
activity on C18:0 substrates as compared to activity on
other substrates of the enzyme, such as C18:1 and C16:0.
Examples of these types of thioesterase modifications are
provided in the following examples. Further modifications
of plant thioesterases are also desirable and may be
obtained using the methods and sequences provided herein.
For example, plant thioesterases may be modified to shift
the enzymatic activity towards hydrolysis of shorter chain
fatty acids, such as C8 and C10. Comparison of closely
related thioesterase sequences, such as the C. palustris
C8/10, the C. palustris C14 and the C. hookeriana C8/10
thioesterase sequences provided herein, may be used to
identify potential target amino acid residues for
alteration of thioesterase specificity.

In initial experiments aimed at altering substrate
specificity of plant thioesterase enzymes, two highly
related Class II thioesterases were studied, a C12
preferring acyl-ACP thioesterase from California bay
(Umbellularia californica) and a C14 preferring acyl-ACP
thioesterase from camphor (Cinnamomum camphora). These
enzymes demonstrate 90% amino acid sequence identity in the
mature protein region yet have different substrate
specificities. Constructs for expression of chimeric
mature thioesterases were prepared which encoded chimeric
thioesterase enzymes containing the N-terminal mature
protein region of either the camphor or bay thioesterase
and the C-terminal portion of the other thioesterase. The
N-terminal thioesterase portion as encoded in these
constructs contains approximately one third of the mature
thioesterase protein, and the C-terminal portion contains
the remaining two thirds of the mature thioesterase region.
As described in more detail in the following examples, we
have discovered that the C-terminal two thirds portion of
these plant thioesterases is critical in determining the
substrate specificity. The chimeric enzyme containing the
C-terminal portion of the camphor thioesterase (Ch-1)
demonstrates the same activity profile as native camphor
thioesterase (specific for 14:0), and the chimeric protein
with the bay thioesterase C-terminus (Ch-2) demonstrates
the same activity profile as native bay thioesterase (12:0
specific).
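
The composition of the two chimeras can be illustrated at the
protein-sequence level with a minimal Python sketch. It assumes
two aligned mature sequences of equal length and a single junction
position; in the constructs themselves the junction is defined by a
shared restriction site, not by a residue count. The function
make_chimera and the placeholder strings are hypothetical, and the
letters B and C merely mark the parent of origin, not residues.

```python
def make_chimera(n_donor: str, c_donor: str, junction: int) -> str:
    """Join the N-terminal part of one parent to the C-terminal part of the other.

    Residues before `junction` come from n_donor; residues from
    `junction` onward come from c_donor. Inputs are assumed aligned.
    """
    if len(n_donor) != len(c_donor):
        raise ValueError("parent sequences must be aligned to equal length")
    if not 0 < junction < len(n_donor):
        raise ValueError("junction must fall inside the sequence")
    return n_donor[:junction] + c_donor[junction:]


# Placeholder parents: 'B' marks bay-derived and 'C' camphor-derived positions.
BAY = "B" * 9
CAMPHOR = "C" * 9
JUNCTION = len(BAY) // 3  # N-terminal portion is about one third

CH1 = make_chimera(BAY, CAMPHOR, JUNCTION)   # bay N-terminus, camphor C-terminus
CH2 = make_chimera(CAMPHOR, BAY, JUNCTION)   # camphor N-terminus, bay C-terminus
```

In this sketch CH1 carries the camphor C-terminal two thirds and
CH2 the bay C-terminal two thirds, matching the assignment of
specificity to the C-terminal portion described above.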

Additional studies of the C-terminal end of the
protein were conducted to further locate regions of
thioesterase proteins critical for substrate specificity.
In one such study, the 13 consecutive C-terminal amino
acids of the bay thioesterase were deleted by production of
a mutant gene lacking the coding DNA for this region. The
activity of the expressed mutant thioesterase was compared
to an expressed wild-type bay thioesterase protein. The
activity profiles of the 17 C-terminal mutant and the
wild-type bay thioesterase proteins were the same, demonstrating
that the very C-terminal end of thioesterase proteins is
not a critical region for substrate specificity.

Further analysis of the C-terminal two thirds portion
of the bay C12 preferring acyl-ACP thioesterase was
conducted to identify particular amino acids involved in
substrate specificity. By examining a sequence alignment
of the bay and camphor thioesterases, the least
conservative amino acid substitutions between the two
thioesterases in the C-terminal two thirds portion of the
proteins were identified. Non-conservative amino acid
substitutions include those in which the substituted amino
acid has a different charge than the native amino acid
residue. Amino acids considered as having positively
charged side chains at pH 7 are lysine and arginine.
Histidine can also have a positively charged side chain
under conditions of acidic pH. Amino acids considered as
having negatively charged side chains at pH 7 are aspartate
and glutamate. Non-conservative amino acid substitutions
may also be indicated where the size of the substituted
amino acid differs considerably from the size of the amino
acid normally located at that position. Examples of
non-conserved amino acid differences between the bay and
camphor thioesterases are M197 -> R (Bay TE -> Camphor TE),
R199 -> H, T231 -> K, A293 -> D, R327 -> Q, P380 -> S, and
R381 -> S (amino acid sequence numbering for bay and
camphor thioesterases is shown in Figure 8).
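The charge criterion described above can be illustrated with a short sketch. It is not part of the disclosed methods; the function names are hypothetical, and histidine is assumed to be uncharged at pH 7, consistent with the statement that it carries a positive charge under acidic conditions. The sketch covers only the charge criterion, not the size criterion.

```python
import re

# Side-chain charge at pH 7 as stated in the text: lysine and arginine are
# positive, aspartate and glutamate are negative. Assumption: every other
# residue, including histidine, is treated as uncharged at pH 7.
CHARGE_AT_PH7 = {"K": 1, "R": 1, "D": -1, "E": -1}


def parse_substitution(label):
    """Split a label such as 'M197R' into (native, position, substituted)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    native, position, substituted = match.groups()
    return native, int(position), substituted


def charge_change(label):
    """Return the net change in side-chain charge caused by a substitution."""
    native, _, substituted = parse_substitution(label)
    return CHARGE_AT_PH7.get(substituted, 0) - CHARGE_AT_PH7.get(native, 0)


# Bay -> camphor differences listed in the text.
BAY_TO_CAMPHOR = ["M197R", "R199H", "T231K", "A293D", "R327Q", "P380S", "R381S"]

for label in BAY_TO_CAMPHOR:
    print(f"{label}: charge change {charge_change(label):+d}")
```

Under these assumptions every listed difference except P380 -> S changes the side-chain charge; P380 -> S would have to qualify on other grounds, such as size or backbone geometry.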
Secondary structure predictions may be used to
identify amino acid substitutions likely to have effects on
the secondary structure of the thioesterase protein. For
example, according to secondary structure predictions using
methods of Chou and Fasman, the tripeptide M-R-R (amino
acids 197-199) of bay and the corresponding tripeptide R-R-H
of camphor are located behind a β-sheet and a turn anchored
by two highly conserved glycines (G193 and G196). This
region of plant thioesterases is highly conserved, and the
β-sheet and turn structure is also predicted in other
plant thioesterases.

As described in the following examples, when the bay
M-R-R tripeptide is changed to R-R-H, mimicking the
sequence in camphor thioesterase, the activity of the
mutant towards 12:0, but not 14:0, is reduced about 7 fold
compared to the wild type. This results in an engineered
thioesterase which has approximately equal specific
activity with respect to the 12:0 and 14:0 substrates.

An additional modification of the engineered bay
M197R/R199H thioesterase which converts the threonine
residue at amino acid 231 to a lysine (T231K) alters the
substrate specificity such that the engineered thioesterase
M197R/R199H/T231K is highly 14:0-ACP specific.
Interestingly, the mutation T231K alone does not affect the
bay thioesterase activity. The non-additive, combinatorial
effect of the T231K substitution on M197R/R199H engineered
thioesterase suggests that the altered amino acid sites are
folded close to each other (Sandberg, et al. (1993) Proc.
Natl. Acad. Sci. 90:8367-8371).
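The substitution notation used above (for example M197R/R199H) can be applied to a sequence fragment as in the following sketch. The helper name and its interface are hypothetical; the four-residue fragment G-M-R-R at positions 196-199 is taken from the description of the bay thioesterase above, and the full sequence is not reproduced here.

```python
import re


def apply_substitutions(fragment, first_position, labels):
    """Apply labels such as 'M197R' to a fragment whose first residue
    carries the sequence number `first_position`.

    Each label is checked against the residue actually present, so a
    numbering mistake raises an error instead of silently mutating the
    wrong site.
    """
    residues = list(fragment)
    for label in labels:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
        if match is None:
            raise ValueError(f"unrecognized substitution label: {label!r}")
        native, substituted = match.group(1), match.group(3)
        index = int(match.group(2)) - first_position
        if not 0 <= index < len(residues):
            raise IndexError(f"{label} lies outside the fragment")
        if residues[index] != native:
            raise ValueError(
                f"{label}: expected {native}, found {residues[index]}"
            )
        residues[index] = substituted
    return "".join(residues)


# Bay residues 196-199 (G196 followed by the M-R-R tripeptide).
bay_fragment = "GMRR"
print(apply_substitutions(bay_fragment, 196, ["M197R", "R199H"]))
```

Applied to the bay fragment, the two substitutions convert M-R-R into the camphor-like R-R-H tripeptide while leaving the conserved glycine untouched.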

As described in the following Examples, amino acid
substitutions near the active site (YRREC, amino acids
357-361 in Figure 1 consensus numbering) of the plant acyl-ACP
thioesterases may result in large reductions in
thioesterase activity. Modification of bay thioesterase to
produce R327Q results in a 100-fold decrease in the bay
thioesterase activity. The decreased activity of R327Q is
likely due to the fact that this amino acid position is
located very close to the active site cysteine, C320 of the
bay thioesterase sequence in Figure 8.

Expression of engineered thioesterases having altered
substrate specificities in host cells and analysis of
resulting fatty acid compositions demonstrates that the
altered substrate specificities of the engineered
thioesterases are reflected in the fatty acid composition
profiles of the host cells. This is significant because
enzyme activity in vivo might have involved sequential
interactions or parameters such as lifetime and
folding/unfolding rates which would not be reflected in in
vitro activity assays. The major lipid components of
E. coli membranes are phosphatidylethanolamine and
phosphatidylglycerol, which contain predominantly
long-chain fatty acyl moieties. Recombinant expression of
native bay thioesterase cDNA in fadD cells redirects the
bacterial type II fatty acid synthase system from
long-chain to medium-chain production, and similar results are
obtained upon expression of native bay thioesterase in
seeds of transgenic plants (Voelker et al. (1994) supra;
Voelker et al. (1992) supra). Thus, E. coli in vivo data
may be used to predict the effects of expression of
engineered thioesterases in transgenic plants.

With native bay thioesterase, E. coli fadD cells
produce large amounts of 12:0 free fatty acid and small
amounts of 14:0 (about 5 to 10% of 12:0 levels) (Voelker et
al. (1994) and Table I). However, as demonstrated in the
following examples, following two amino acid substitutions
(M197R/R199H), expression of an engineered bay thioesterase
enzyme results in accumulation of similar amounts of 12:0
and 14:0 fatty acids. Similarly, expression of the
engineered bay thioesterase with three amino acid
substitutions (M197R/R199H/T231K) completely reverses the
12:0/14:0 ratio of fatty acids produced as compared to
results with native bay thioesterase.

Thus, as the result of modifications to the substrate 
specificity of plant thioesterases, it can be seen that the 
relative amounts of the fatty acids produced in a cell 
where various substrates are available for hydrolysis may 
be altered. Furthermore, molecules which are formed from 
available free fatty acids, such as plant seed 
triglycerides, may also be altered as a result of 
expression of engineered thioesterases having altered 
substrate specificities. 

In addition to known acyl-ACP thioesterases and 
encoding sequences, such as provided herein, other acyl-ACP 
thioesterase sequences may be obtained from a variety of 
plant species, and such thioesterases and encoding 
sequences will find use in the methods of this invention. 
As noted above, plant thioesterase encoding sequences are 
highly conserved, particularly for those thioesterases 
which are members of the same class of thioesterase, i.e. 
Class I or Class II. Thus, for isolation of additional 
thioesterases, a genomic or other appropriate library 
prepared from a candidate plant source of interest is 
probed with conserved sequences from one or more Class I or 
Class II plant thioesterase sequences to identify
homologously related clones. Positive clones are analyzed
by restriction enzyme digestion and/or sequencing. Probes
can also be considerably shorter than the entire sequence.
Oligonucleotides may be used, for example, but should be at
least about 10, preferably at least about 15, more
preferably at least 20 nucleotides in length. When shorter
length regions are used for comparison, a higher degree of
sequence identity is required than for longer sequences.
Shorter probes are often particularly useful for polymerase
chain reactions (PCR) (Gould, et al., PNAS USA (1989)
86:1934-1938), especially for isolation of plant
thioesterases which contain highly conserved sequences.
PCR using oligonucleotides to conserved regions of plant
thioesterases may also be used to generate homologous
probes for library screening.

When longer nucleic acid fragments are employed (>100
bp) as probes, especially when using complete or large cDNA
sequences, one can still screen with moderately high
stringencies (for example using 50% formamide at 37°C with
minimal washing) in order to obtain signal from the target
sample with 20-50% deviation, i.e., homologous sequences.
(For additional information regarding screening techniques
see Beltz, et al. Methods in Enzymology (1983) 100:266-285.)

The nucleic acid or amino acid sequences encoding an
engineered plant acyl-ACP thioesterase of this invention
may be combined with other non-native, or "heterologous",
sequences in a variety of ways. By "heterologous"
sequences is meant any sequence which is not naturally
found joined to the plant acyl-ACP thioesterase, including,
for example, combinations of nucleic acid sequences from
the same plant which are not naturally found joined
together.

For expression in host cells, sequence encoding an
engineered plant thioesterase is combined in a DNA
construct having, in the 5' to 3' direction of
transcription, a transcription initiation control region
capable of promoting transcription and translation in a
host cell, the DNA sequence encoding the engineered plant
acyl-ACP thioesterase and a transcription and translation
termination region.

DNA constructs may or may not contain pre-processing 
sequences, such as transit peptide sequences. Transit 
peptide sequences facilitate the delivery of the protein to 
a given organelle and are cleaved from the amino acid 
moiety upon entry into the organelle, releasing the 
"mature" sequence. The use of the precursor plant acyl-ACP 
thioesterase DNA sequence is preferred in plant cell 
expression cassettes. Other plastid transit peptide 
sequences, such as a transit peptide of seed ACP, may also 
be employed to translocate plant acyl-ACP thioesterases to 
various organelles of interest. 

Thus, engineered plant thioesterase sequences may be 
used in various constructs, such as for expression of the 
thioesterase of interest in a host cell for recovery or 
study of the enzyme in vitro or in vivo. Potential host 
cells include both prokaryotic and eukaryotic cells. A 
host cell may be unicellular or found in a multicellular 
differentiated or undifferentiated organism depending upon 
the intended use. Cells of this invention may be 
distinguished by having an engineered plant acyl-ACP 
thioesterase present therein. 

Depending upon the host, the regulatory regions will 
vary, including regions from viral, plasmid or chromosomal 
genes, or the like. For expression in prokaryotic or 
eukaryotic microorganisms, particularly unicellular hosts, 
a wide variety of constitutive or regulatable promoters may 
be employed. Expression in a microorganism can provide a 
ready source of the engineered plant enzyme and is useful 
for identifying the particular characteristics of such 
enzymes. Among transcriptional initiation regions which
have been described are regions from bacterial and yeast
hosts, such as E. coli, B. subtilis, Saccharomyces
cerevisiae, including genes such as beta-galactosidase, T7
polymerase, tryptophan E and the like.

For the most part, the constructs will involve
regulatory regions functional in plants which provide for
expression of the plant acyl-ACP thioesterase, and thus
result in the modification of the fatty acid composition in
plant cells. The open reading frame coding for the plant
acyl-ACP thioesterase will be joined at its 5' end to a
transcription initiation regulatory region such as the
wild-type sequence naturally found 5' upstream to the
thioesterase structural gene. Numerous other transcription
initiation regions are available which provide for a wide
variety of constitutive or regulatable, e.g., inducible,
transcription of the structural gene functions. Among
transcriptional initiation regions used for plants are such
regions associated with the structural genes such as for
nopaline and mannopine synthases, or with napin, ACP
promoters and the like. The transcription/translation
initiation regions corresponding to such structural genes
are found immediately 5' upstream to the respective start
codons. In embodiments wherein the expression of the
engineered thioesterase protein is desired in a plant host,
the use of part of the native plant acyl-ACP thioesterase
gene is considered. Namely, all or a portion of the 5'
upstream non-coding regions (promoter) together with 3'
downstream non-coding regions may be employed. If a
different promoter is desired, such as a promoter native to
the plant host of interest or a modified promoter, i.e.,
having transcription initiation regions derived from one
gene source and translation initiation regions derived from
a different gene source (enhanced promoters), such as
double 35S CaMV promoters, the sequences may be joined
together using standard techniques.

For such applications when 5' upstream non-coding
regions are obtained from other genes regulated during seed
maturation, those preferentially expressed in plant embryo
tissue, such as ACP and napin-derived transcription
initiation control regions, are desired. Such
"seed-specific promoters" may be obtained and used in accordance
with the teachings of U.S. Serial No. 07/147,781, filed
1/25/88 (now U.S. Serial No. 07/550,804, filed 7/9/90), and
U.S. Serial No. 07/494,722 filed on or about March 16, 1990
having a title "Novel Sequences Preferentially Expressed In
Early Seed Development and Methods Related Thereto," which
references are hereby incorporated by reference.
Transcription initiation regions which are preferentially
expressed in seed tissue, i.e., which are undetectable in
other plant parts, are considered desirable for fatty acid
modifications in order to minimize any disruptive or
adverse effects of the gene product.

Regulatory transcript termination regions may be
provided in DNA constructs of this invention as well.
Transcript termination regions may be provided by the DNA
sequence encoding the plant acyl-ACP thioesterase or a
convenient transcription termination region derived from a
different gene source, for example, the transcript
termination region which is naturally associated with the
transcript initiation region. Where the transcript
termination region is from a different gene source, it will
contain at least about 0.5 kb, preferably about 1-3 kb of
sequence 3' to the structural gene from which the
termination region is derived.

Plant expression or transcription constructs having a
plant acyl-ACP thioesterase as the DNA sequence of interest
may be employed with a wide variety of plant life,
particularly, plant life involved in the production of
vegetable oils for edible and industrial uses. Most
especially preferred are temperate oilseed crops. Plants
of interest include, but are not limited to, rapeseed
(Canola and High Erucic Acid varieties), sunflower,
safflower, cotton, Cuphea, soybean, peanut, coconut and oil
palms, and corn. Depending on the method for introducing
the recombinant constructs into the host cell, other DNA
sequences may be required. Importantly, this invention is
applicable to dicotyledon and monocotyledon species alike
and will be readily applicable to new and/or improved
transformation and regulation techniques.

The method of transformation is not critical to the
instant invention; various methods of plant transformation
are currently available. As newer methods are available to
transform crops, they may be directly applied hereunder.
For example, many plant species naturally susceptible to
Agrobacterium infection may be successfully transformed via
tripartite or binary vector methods of Agrobacterium
mediated transformation. In addition, techniques of
microinjection, DNA particle bombardment and electroporation
have been developed which allow for the transformation of
various monocot and dicot plant species.

In developing the DNA construct, the various
components of the construct or fragments thereof will
normally be inserted into a convenient cloning vector which
is capable of replication in a bacterial host, e.g., E.
coli. Numerous vectors exist that have been described in
the literature. After each cloning, the plasmid may be
isolated and subjected to further manipulation, such as
restriction, insertion of new fragments, ligation,
deletion, insertion, resection, etc., so as to tailor the
components of the desired sequence. Once the construct has
been completed, it may then be transferred to an
appropriate vector for further manipulation in accordance
with the manner of transformation of the host cell.

Normally, included with the DNA construct will be a
structural gene having the necessary regulatory regions for
expression in a host and providing for selection of
transformant cells. The gene may provide for resistance to
a cytotoxic agent, e.g. antibiotic, heavy metal, toxin,
etc., complementation providing prototrophy to an
auxotrophic host, viral immunity or the like. Depending
upon the number of different host species into which the
expression construct or components thereof are introduced,
one or more markers may be employed, where different
conditions for selection are used for the different hosts.

It is noted that the degeneracy of the DNA code
means that some codon substitutions in DNA sequences are
permissible without any corresponding modification of the
amino acid sequence.
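The point about codon degeneracy can be checked with a short sketch using the standard genetic code. The two 12-nucleotide sequences below are illustrative only and are not taken from any thioesterase clone; both encode the tetrapeptide M-R-R-T while differing at three codons.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences chosen for illustration.
original = "ATGCGTCGTACC"
synonymous = "ATGAGAAGGACA"

print(translate(original), translate(synonymous))
```

Both sequences translate to the same peptide, so the codon changes are silent at the protein level.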

The manner in which the DNA construct is introduced
into the plant host is not critical to this invention. Any
method which provides for efficient transformation may be
employed. Various methods for plant cell transformation
include the use of Ti- or Ri-plasmids, microinjection,
electroporation, DNA particle bombardment, liposome fusion,
DNA bombardment or the like. In many instances, it will be
desirable to have the construct bordered on one or both
sides by T-DNA, particularly having the left and right
borders, more particularly the right border. This is
particularly useful when the construct uses A. tumefaciens
or A. rhizogenes as a mode for transformation, although the
T-DNA borders may find use with other modes of
transformation.

Where Agrobacterium is used for plant cell 
transformation, a vector may be used which may be 
introduced into the Agrobacterium host for homologous 
recombination with T-DNA or the Ti- or Ri-plasmid present 
in the Agrobacterium host. The Ti- or Ri-plasmid 
containing the T-DNA for recombination may be armed 
(capable of causing gall formation) or disarmed (incapable 
of causing gall formation), the latter being permissible, 
so long as the vir genes are present in the transformed 
Agrobacterium host. The armed plasmid can give a mixture
of normal plant cells and gall.

In some instances where Agrobacterium is used as the
vehicle for transforming plant cells, the expression
construct bordered by the T-DNA border(s) will be inserted
into a broad host spectrum vector, there being broad host
spectrum vectors described in the literature. Commonly
used is pRK2 or derivatives thereof. See, for example,
Ditta et al., PNAS USA, (1980) 77:7347-7351 and EPA 0 120
515, which are incorporated herein by reference. Included
with the expression construct and the T-DNA will be one or
more markers, which allow for selection of transformed
Agrobacterium and transformed plant cells. A number of
markers have been developed for use with plant cells, such
as resistance to chloramphenicol, the aminoglycoside G418,
hygromycin, or the like. The particular marker employed is
not essential to this invention, one or another marker
being preferred depending on the particular host and the
manner of construction.

Once a transgenic plant is obtained which is capable
of producing seed having a modified fatty acid composition,
traditional plant breeding techniques, including methods of
mutagenesis, may be employed to further manipulate the
fatty acid composition. Alternatively, additional foreign
fatty acid modifying DNA sequence may be introduced via
genetic engineering to further manipulate the fatty acid
composition. It is noted that the method of transformation
is not critical to this invention. However, the use of
genetic engineering plant transformation methods, i.e., the
power to insert a single desired DNA sequence, is critical.
Heretofore, the ability to modify the fatty acid
composition of plant oils was limited to the introduction
of traits that could be sexually transferred during plant
crosses or viable traits generated through mutagenesis.
Through the use of genetic engineering techniques which
permit the introduction of inter-species genetic
information and the means to regulate the tissue-specific
expression of endogenous genes, a new method is available
for the production of plant seed oils with modified fatty
acid compositions. In addition, there is the potential for
the development of novel plant seed oils upon application
of the tools described herein.

One may choose to provide for the transcription or
transcription and translation of one or more other
sequences of interest in concert with the expression of an
engineered plant acyl-ACP thioesterase in a plant host
cell. In particular, the expression of a plant LPAAT
protein having activity on medium-chain or very long-chain
fatty acids in combination with expression of an engineered
plant acyl-ACP thioesterase may be preferred in some
applications. See WO 95/27791 for plant LPAAT encoding
sequences.

When one wishes to provide a plant transformed for the
combined effect of more than one nucleic acid sequence of
interest, typically a separate nucleic acid construct will
be provided for each. The constructs, as described above,
contain transcriptional or transcriptional and
translational regulatory control regions. One skilled in
the art will be able to determine
regulatory sequences to provide for a desired timing and
tissue specificity appropriate to the final product in
accord with the above principles set forth as to the
respective expression or anti-sense constructs. When two
or more constructs are to be employed, whether they are
both related to the same fatty acid modifying sequence or a
different fatty acid modifying sequence, it may be desired
that different regulatory sequences be employed in each
cassette to reduce spontaneous homologous recombination
between sequences. The constructs may be introduced into
the host cells by the same or different methods, including
the introduction of such a trait by crossing transgenic
plants via traditional plant breeding methods, so long as
the resulting product is a plant having both
characteristics integrated into its genome.

The invention now being generally described, it will
be more readily understood by reference to the following
examples which are included for purposes of illustration
only and are not intended to limit the present invention.

EXAMPLES

Example 1 Sequences of Plant Acyl-ACP Thioesterases

A. California Bay (Umbellularia californica)

DNA sequence and translated amino acid sequence of
California bay Class II thioesterase clone pCGN3822 is
provided in Figure 1 of WO 92/20236. Expression of the
mature portion of the bay thioesterase protein in E. coli
and analysis of thioesterase activity reveals a strong
specificity of the bay thioesterase for 12:0-ACP substrate,
although some activity towards 14:0-ACP is also observed
(Voelker et al. (1994) supra, and Figure 2A herein).
Furthermore, when bay thioesterase is expressed in E. coli
fadD cells, large amounts of laurate (more than 500-fold
above control background) and small amounts of myristate
(about 10% of that of laurate) are produced. Production of
similar ratios of laurate and myristate is also observed
upon expression of the bay thioesterase in seeds of
Brassica napus or Arabidopsis thaliana (Voelker et al.
(1992) supra).

B. Camphor (Cinnamomum camphora)

DNA sequence and translated amino acid sequence of a
Class II camphor thioesterase encoding region generated by
PCR is provided in Figure 5B of WO 92/20236. Sequence (SEQ
ID NO:7) of a DNA fragment obtained by PCR from reverse
transcribed cDNA and containing the mature protein region
of the camphor clone is provided in Figure 3. The
sequence begins at the XbaI site located at the beginning
of the presumed mature protein encoding region of the
camphor thioesterase.

The camphor PCR fragment described above is cloned
into a pAMP vector resulting in pCGN5219. pCGN5219 is
digested with XbaI and SalI and the resulting camphor
thioesterase fragment is cloned into XbaI and SalI digested
pBCSK+ (Stratagene), resulting in pCGN5220. pCGN5220 is
used to transform E. coli fadD for analysis of acyl-ACP
thioesterase activity as described in Pollard et al. (Arch.
Biochem. & Biophys. (1991) 281:306-312). Results of
thioesterase activity assays on camphor thioesterase clones
using 8:0, 10:0, 12:0, 14:0, 16:0, 18:0 and 18:1 acyl-ACP
substrates demonstrate substrate specificity mainly on 14:0
substrates, although a lesser increase in 12:0 hydrolysis
activity is also observed (Fig. 2B).

C. Mangosteen (Garcinia mangifera)

A cDNA bank is prepared from seeds extracted from
mature mangosteen fruit using the methods as described in
Stratagene Zap cDNA synthesis kit (Stratagene; La Jolla,
CA). Oil analysis of the mangosteen tissues used for RNA
isolation reveals 18:0 levels of approximately 50%. Oil
analysis of seeds from less mature mangosteen fruit reveals
18:0 levels of 20-40%. Total RNA is isolated from the
mangosteen seeds by modifying the CTAB DNA isolation method
of Webb and Knapp (Plant Mol. Biol. Reporter (1990) 5:180-
195). Buffers include:

REC: 50 mM TrisCl pH 9, 0.7 M NaCl, 10 mM EDTA pH 8,
0.5% CTAB.

REC+: Add β-mercaptoethanol to 1% immediately prior
to use.

RECP: 50 mM TrisCl pH 9, 10 mM EDTA pH 8, and 0.5%
CTAB.

RECP+: Add β-mercaptoethanol to 1% immediately prior
to use.

For extraction of 1 g of tissue, 10 ml of REC+ and 0.5
g of PVPP are added to tissue that has been ground in liquid
nitrogen and homogenized. The homogenized material is
centrifuged for 10 min at 12,000 rpm. The supernatant is
poured through miracloth onto 3 ml cold chloroform and
homogenized again. After centrifugation at 12,000 rpm for 10
min, the upper phase is taken and its volume determined. An
equal volume of RECP+ is added and the mixture is allowed
to stand for 20 min at room temperature. The material is
centrifuged for 20 min at 10,000 rpm twice and the
supernatant is discarded after each spin. The pellet is
dissolved in 0.4 ml of 1 M NaCl (DEPC) and extracted with
an equal volume of phenol/chloroform. Following ethanol
precipitation, the pellet is dissolved in 1 ml of DEPC
water.

Briefly, the cloning method for cDNA synthesis is as
follows. First strand cDNA synthesis is according to
Stratagene Instruction Manual with some modifications
according to Robinson, et al. (Methods in Molecular and
Cellular Biology (1992) 3:118-127). In particular,
approximately 57 μg of LiCl precipitated total RNA was used
instead of 5 μg of poly(A)+ RNA and the reaction was
incubated at 45°C rather than 37°C for 1 hour.

Probes for library screening are prepared by PCR from
mangosteen cDNA using oligonucleotides to conserved plant
acyl-ACP thioesterase regions. Probes Garm 2 and Garm 106
are prepared using the following oligonucleotides. The
nucleotide base codes for the below oligonucleotides are as
follows:

A = adenine
C = cytosine
T = thymine
U = uracil
G = guanine
S = guanine or cytosine
K = guanine or thymine
W = adenine or thymine
M = adenine or cytosine
R = adenine or guanine
Y = cytosine or thymine
B = guanine, cytosine or thymine
H = adenine, cytosine or thymine
N = adenine, cytosine, guanine or thymine

Garm 2

4874: 5' CUACUACUACUASYNTVNGYNATGATGAA 3' (SEQ ID NO: 12)
4875: 5' CAUCAUCAUCAURCAYTCNCKNCKRTANTC 3' (SEQ ID NO: 13)

Primer 4874 is a sense primer designed to correspond to
possible encoding sequences for conserved peptide
V/L/A W/S/Y V/A M M N, where the one letter amino acid code
is used and a slash between amino acids indicates more than
one amino acid is possible for that position. Primer 4875
is an antisense primer designed to correspond to possible
encoding sequences for peptide D/E Y R R E C.

Garm 106

5424: 5' AUGGAGAUCUCUGAWCRBTAYCCTAMHTGGGGWGA 3' (SEQ ID NO: 14)
5577: 5' ACGCGUACUAGUTTNKKNCKCCAYTCNGT 3' (SEQ ID NO: 15)

Primer 5424 is a sense primer designed to correspond to
possible encoding sequences for peptide E/D H/R Y P K/T W G
D.

Primer 5577 is an antisense primer designed to correspond
to possible encoding sequences for peptide T E W R K/P K.
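The degenerate portions of the Garm 2 primers can be examined with the following sketch, which counts how many distinct sequences each degenerate oligonucleotide represents and converts the antisense primer to its sense-strand equivalent. Two assumptions are made: the uracil-containing 5' tails (CUACUACUACUA and CAUCAUCAUCAU) are treated as non-coding cloning tails and omitted, and the code V (adenine, cytosine or guanine), which appears in primer 4874 but not in the list of base codes, is given its standard IUPAC meaning.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each code, including the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    return prod(len(IUPAC[base]) for base in oligo)


def reverse_complement(oligo):
    """Reverse complement of an oligo written with IUPAC codes."""
    return oligo.translate(COMPLEMENT)[::-1]


# Degenerate 3' portions of primers 4874 (sense) and 4875 (antisense),
# with the assumed 5' tails removed.
PRIMER_4874 = "SYNTVNGYNATGATGAA"
PRIMER_4875 = "RCAYTCNCKNCKRTANTC"

print("4874 degeneracy:", degeneracy(PRIMER_4874))
print("4875 degeneracy:", degeneracy(PRIMER_4875))
print("4875 sense strand:", reverse_complement(PRIMER_4875))
```

Read in triplets, the sense-strand form of primer 4875 (GAN TAY MGN MGN GAR TGY) covers codons for D/E, Y, R, R, E and C, in agreement with the peptide given for that primer.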

The DNA fragments resulting from the above reactions
are amplified for use as probes by cloning or by further
PCR and radiolabeled by random or specific priming.

Approximately 800,000 plaques are plated according to
manufacturer's directions. For screening, plaque filters
are prehybridized at room temperature in 50% formamide, 5X
SSC, 10X Denhardt's, 0.1% (w/v) SDS, 5 mM Na2EDTA, 0.1 mg/ml
denatured salmon sperm DNA. Hybridization with a mixture
of the Garm 2 and Garm 106 probes is conducted at room
temperature in the same buffer as above with added 10% (w/v)
dextran sulfate and probe. Plaque purification and
phagemid excision were conducted as described in Stratagene
Zap cDNA Synthesis Kit instructions.

Approximately 90 acyl-ACP thioesterase clones were
identified and sorted as to thioesterase type by DNA
sequencing and/or PCR analysis. Of the analyzed clones, at
least 28 were Class I (FatA) types, and 59 were Class II
(FatB) types. Two subclasses of FatA type clones were
observed; the most prominent type is termed GarmFatA1 and
the single clone of the second subclass is termed
GarmFatA2. DNA and translated amino acid sequence of
GarmFatA1 clone C14-4 (pCGN5252) (SEQ ID NO: 8) is presented
in Figure 4. DNA sequence and translated amino acid
sequence of the FatA2 clone C14-3 (SEQ ID NO: 9) is
presented in Figure 5.

Constructs for expression of the Figure 4 GarmFatA1
clone in E. coli are prepared as follows. Restriction
sites are inserted by PCR mutagenesis at amino acid 49
(SacI), which is near the presumed mature protein amino
terminus, and following the stop codon for the protein
encoding region (BamHI). The mature protein encoding
region is inserted as a SacI/BamHI fragment into pBC SK
(Stratagene; La Jolla, CA) resulting in pCGN5247, which may
be used to provide for expression of the mangosteen
thioesterase as a lacZ fusion protein.

Results of thioesterase activity assays on mangosteen
Class I thioesterase clone GarmFatA1 using 16:0, 18:0 and
18:1 acyl-ACP substrates are shown below.

            Acyl-ACP Thioesterase activity (cpm/min)

                 16:0       18:0       18:1

Control          1400       3100       1733
GarmFatA1        4366      23916      87366

The GarmFatA1 clone demonstrates preferential activity on
C18:1 acyl-ACP substrate, and also demonstrates substantial
activity (approximately 25% of the 18:1 activity) on C18:0
acyl-ACP substrates. Only a small increase in C16:0
activity over activity in control cells is observed, and
the 16:0 activity represents only approximately 3% of the
18:1 activity.
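The percentages quoted above can be reproduced from the tabulated values with the following sketch. It assumes that the control activity is subtracted from the GarmFatA1 activity for each substrate before the ratio to 18:1 is taken; the text does not state how the percentages were derived, so this is one plausible reading rather than the documented calculation.

```python
# Activities in cpm/min, copied from the table above.
ACTIVITY = {
    "Control": {"16:0": 1400, "18:0": 3100, "18:1": 1733},
    "GarmFatA1": {"16:0": 4366, "18:0": 23916, "18:1": 87366},
}


def net_activity(sample, control):
    """Subtract the control (background) activity for each substrate."""
    return {substrate: sample[substrate] - control[substrate]
            for substrate in sample}


def percent_of_reference(net, reference):
    """Express each net activity as a percentage of the reference substrate."""
    return {substrate: 100.0 * value / net[reference]
            for substrate, value in net.items()}


net = net_activity(ACTIVITY["GarmFatA1"], ACTIVITY["Control"])
relative = percent_of_reference(net, "18:1")

for substrate, percent in relative.items():
    print(f"{substrate}: {percent:.1f}% of net 18:1 activity")
```

With background subtraction the 18:0 and 16:0 activities come to roughly 24% and 3% of the 18:1 activity, consistent with the approximate figures given in the text.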

Expression of GarmFatA2 thioesterase in E. coli and
assay of the resultant thioesterase activity demonstrate
that C18:1 is highly preferred as the acyl-ACP substrate.
The thioesterase activities on 16:0 and 18:0 acyl-ACP
substrates are approximately equal and represent less than
5% of the observed 18:1 activity.

D. Brassica campestris (rapa)

DNA sequence and translated amino acid sequence of a 
Brassica campestris Class I acyl-ACP thioesterase are 
provided in WO 92/20236 (Figure 6).

E. Cuphea palustris C8/C10

Total RNA is isolated from developing seeds of C.
palustris using the modified CTAB procedure described
above. A lambda ZipLox (BRL; Gaithersburg, MD) cDNA
library containing approximately 6 × 10^6 pfu is constructed
from total RNA. Approximately 500,000 plaques from the
unamplified library are screened using a mixed probe
containing the thioesterase coding regions from Cuphea
hookeriana Class II thioesterase clones CUPH-1 (CMT-9),
CUPH-2 (CMT-7) and CUPH-5 (CMT-10). (DNA sequences of
these clones are provided in WO 94/10288.) Low stringency
hybridization conditions are used as follows: hybridization
is conducted at room temperature in a solution of 30%
formamide and 2X SSC (1X SSC = 0.15 M NaCl; 0.015 M Na
citrate). Eighty-two putative positive clones were
identified, thirty of which were plaque purified. The
nucleic acid sequence and translated amino acid sequence of
a clone designated as MCT29 (CpFatB1) (SEQ ID NO: 10) is
provided in Figure 6. The translated amino acid sequence
of this clone is approximately 83% identical to the
sequence of a Cuphea hookeriana CUPH-2 clone (CMT-7 in
Figure 7 of WO 94/10288) having primary thioesterase
activity on C8:0 and C10:0 fatty acyl-ACP substrates.
Constructs for expression of MCT29 in E. coli are
prepared. SphI and StuI sites are inserted by PCR 5' to the
presumed mature protein N-terminus located at amino acid
114. The mature N-terminus is predicted by correspondence
to Leu 84, originally identified as the mature protein
N-terminus of the bay thioesterase. The mature protein
encoding region is cloned as a StuI/XbaI fragment into
pUC118, resulting in clone MCT29LZ, to provide for
expression of the C. palustris thioesterase in E. coli as a
lacZ fusion protein. Lysates of transformed E. coli cells
expressing the MCT29 thioesterase protein are assayed for
acyl-ACP thioesterase activity. The results demonstrate
that CpFatB1 encodes a thioesterase enzyme having activity
primarily on C8- and C10-ACP substrates, with 50% higher
activity on C8-ACP than on C10-ACP. Low activity on C14-ACP
substrate is also observed at levels of approximately 10%
of the C8-ACP activity.

MCT29LZ is also transformed into E. coli fadD, an E.
coli mutant which lacks medium-chain specific acyl-CoA
synthetase (Overath et al., Eur. J. Biochem. (1969) 7:559-574),
for analysis of lipid composition. Results of these
analyses demonstrate a substantial increase in the
production of 8:0 and 10:0 fatty acids in cells transformed
with the C. palustris MCT29LZ clone.

The closely related C. hookeriana ChFatB2 clone also
demonstrates preferential activity on C8:0 and C10:0
acyl-ACP substrates, with 50% higher activity on C10:0 as
opposed to C8:0 substrates. Expression of the ChFatB2
clone in seeds of transgenic Brassica plants results in
increased production of C8 and C10 fatty acids in the
seeds, with C10 levels higher than C8 levels. (See
co-pending application SN 08/261,695 filed June 16, 1994.)

F. Cuphea palustris C14

The nucleic acid sequence and translated amino acid
sequence of an additional C. palustris Class II
thioesterase clone, MCT34 (CpFatB2) (SEQ ID NO: 11), is
provided in Figure 7. The translated amino acid sequence
of this clone is approximately 80% identical to the
sequence of a Cuphea hookeriana CUPH-4 clone (CMT-13 in
Figure 8 of WO 94/10288).

Constructs for expression of MCT34 in E. coli are
prepared. SphI and StuI sites are inserted by PCR 5' to the
presumed mature protein N-terminus located at amino acid
108. The mature protein encoding region is cloned
as a StuI/XbaI fragment into pUC118, resulting in clone
MCT34LZ, to provide for expression of the C. palustris
thioesterase in E. coli as a lacZ fusion protein. Lysates
of transformed E. coli cells expressing the MCT34
thioesterase protein are assayed for acyl-ACP thioesterase
activity. The results demonstrate that CpFatB2 encodes a
thioesterase enzyme having activity primarily on C14-ACP
substrate. Activity on C16-ACP substrate is also observed
at levels of approximately 30% of the C14-ACP activity.

MCT34LZ is also transformed into E. coli fadD, an E.
coli mutant which lacks medium-chain specific acyl-CoA
synthetase (Overath et al., Eur. J. Biochem. (1969) 7:559-574),
for analysis of lipid composition. Results of these
analyses demonstrate a substantial increase in the
production of 14:0 and 14:1 fatty acids in cells
transformed with the C. palustris MCT34LZ clone.

Example 2 Chimeric Thioesterase Constructs

The cDNAs of the bay and camphor thioesterases both
contain open reading frames encoding 382 amino acids. Only
31 amino acids differ, and more than half of these
differences are conservative substitutions (Fig. 8). The
codon usage is highly conserved between the two genes,
suggesting a common origin.

Plasmid pCGN3823 (WO 92/20236 and Voelker et al.
(1994) supra) contains a 1.2-kb XbaI fragment of a bay C12
preferring thioesterase cDNA in a pBS- (Stratagene; La
Jolla, CA) plasmid backbone and encodes the mature bay
thioesterase protein beginning at amino acid 84 (as
numbered in Voelker et al. (1992) supra). Amino acid 84 of
the bay thioesterase was initially identified as the amino
terminus for the mature protein based on amino acid
sequence analysis of the purified protein. Comparison to
translated amino acid sequences of other cloned plant
medium-chain acyl-ACP thioesterases, however, indicates
that the amino terminus may be located further upstream of
the Leu 84 residue (Jones et al. (1995) supra). Plasmid
pCGN5220, described above, contains an XbaI/XhoI fragment
of a camphor C14 preferring thioesterase cDNA inserted into
pBC+ plasmid (Stratagene). The XbaI site in the camphor
cDNA is present at amino acid residue 84, a leucine, as in
the bay thioesterase encoding region.

There is a conserved, unique KpnI site in both the
bay and camphor cDNA clones at amino acid residue 177 of
the encoding sequence for the precursor bay and camphor
thioesterases (Fig. 9). A second KpnI site is located
within the polylinkers of the plasmids 3' to the stop
codons of the thioesterase sequences. The interchange of
the two KpnI fragments between pCGN3823 and pCGN5220 allows
the fusion of the N-terminal region of one thioesterase to
the C-terminal region of the other, forming two chimeric
enzymes.

To prepare the chimeric constructs, pCGN3823 and
pCGN5220 were digested with KpnI and the resulting
fragments gel-purified and ligated into the backbone
plasmid from the opposite origin. DNA mini-preparations and

SUBSTITUTE SHEET (RULE 26) 



WO 96/36719 PCT/US96/07064 

restriction digestions were used to identify the correct 
fusion constructs. The chimeric constructs used for 
expression and enzyme assays were also confirmed by DNA 
sequencing. 

The resulting chimeric enzymes contain 92 amino acids
from the N-terminal portion of one thioesterase and 207
amino acids from the C-terminal portion of the other. The
fusion protein containing the C-terminal portion of the
camphor thioesterase is referred to as Chimeric 1 (Ch-1),
and the other fusion protein is called Chimeric 2 (Ch-2)
(Fig. 9).
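The reciprocal exchange can be pictured at the protein level with a minimal sketch. It assumes two aligned sequences of equal length and a single junction; the placeholder strings below are not real sequences, and the function name is hypothetical. Only the 92 and 207 residue counts come from the text.

```python
def swap_c_termini(protein_a, protein_b, junction):
    """Exchange the regions C-terminal to `junction` (a 0-based index)
    between two aligned, equal-length sequences."""
    if len(protein_a) != len(protein_b):
        raise ValueError("sequences must be the same length")
    chimera_ab = protein_a[:junction] + protein_b[junction:]
    chimera_ba = protein_b[:junction] + protein_a[junction:]
    return chimera_ab, chimera_ba


# Placeholder sequences: 'B' marks bay-derived residues and 'C' marks
# camphor-derived residues (hypothetical stand-ins, not real sequences).
N_TERMINAL_RESIDUES = 92
C_TERMINAL_RESIDUES = 207
mature_length = N_TERMINAL_RESIDUES + C_TERMINAL_RESIDUES

bay = "B" * mature_length
camphor = "C" * mature_length

# Ch-1 carries the camphor C-terminal portion; Ch-2 is the reciprocal.
ch1, ch2 = swap_c_termini(bay, camphor, junction=N_TERMINAL_RESIDUES)
print(len(ch1), ch1.count("B"), ch1.count("C"))
```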

Example 3 Flexibility and Secondary Structure Analyses 

Predicted secondary structures of plant acyl-ACP
thioesterases are determined by computer analysis.
Secondary structure predictions are based on methods of
Chou and Fasman (Chou et al. (1974) Biochem. 13:222-245;
Prevelige et al. (1989) in Prediction of Protein Structure
and the Principles of Protein Conformation (Fasman, G.D.
ed.), pp 391-416, Plenum, New York; and Garnier et al.
(1978) J. Mol. Biol. 120:97-120).

The flexibility of various regions of plant acyl-ACP
thioesterases is predicted by computer analysis
using MacVector (International Biotechnologies, Inc.),
based on flexibility prediction methods of Karplus and
Schulz (Naturwiss. (1985) 72:212-213).

Example 4 Engineering FatB Thioesterases 
A. Bay C12 Thioesterase 

PCR site-directed mutagenesis (Higuchi et al. (1988)
Nucl. Acids Res. 16:7351-7367) is used for amino acid
replacements. The sense mutant primers used for the
mutagenesis are as follows:

M197R/R199H  5'-GGAAATAATGGC£^CGACATGATTTCCTTGTCC-3'
             (SEQ ID NO: 16)

T231K        5'-GGTTGTCCAAAATCCC-3' (SEQ ID NO: 17)

R327Q        5'-GCGTGCTGCAGTCCCTGACC-3' (SEQ ID NO: 18)

R322M/R327Q  5'-GAGAGAGTGCACGATGGATAGCGTGCTGCAGTCCCTGACC-3'
             (SEQ ID NO: 19)

where bold letters M, R, H, T, K and Q are one-letter
abbreviations for amino acids methionine, arginine,
histidine, threonine, lysine and glutamine respectively,
and the mutated nucleotides are underlined.

PCR conditions were as follows: five cycles of the PCR
were programmed with denaturation for 1 min at 94°C,
renaturation for 30 seconds at 48°C, and elongation for 2
min at 72°C. These first five cycles were followed by 30
cycles with renaturation for 30 seconds at 60°C. The
amplified DNA was recovered by ethanol precipitation, and
examined by gel electrophoresis. The DNA was then digested
with XbaI and BamHI, ethanol precipitated and ligated into
XbaI/BamHI cut pBC plasmid. The ligation mixture was used
to transform Sure cells (Stratagene) by electroporation,
and the transformed cells were plated on LB medium
containing 50 mg/l of chloramphenicol. Constructs
containing the correct inserts were identified by mini-DNA
preparation and restriction digestion. The inserted DNA was
sequenced to confirm the mutations.

The same designations noted above for the PCR primers
were used for the mutant clones. As an example,
M197R/R199H refers to a clone in which the methionine at
residue 197 (of precursor bay thioesterase) was changed to
an arginine, and where the arginine at residue 199 was
changed to a histidine. Similarly, T231K indicates a
mutant in which the threonine at residue 231 was changed to
a lysine.
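The naming convention lends itself to a small parser. The following is a sketch only: the helper names are hypothetical, and the five-residue fragment is a made-up context in which only the Met-Arg-Arg tripeptide at residues 197-199 reflects the bay sequence as described in this specification.

```python
import re

# One substitution: original residue, position, replacement (e.g. M197R).
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(designation):
    """Split a designation such as 'M197R/R199H' into
    (original residue, 1-based position, replacement) tuples."""
    parsed = []
    for part in designation.split("/"):
        match = MUTATION.match(part)
        if match is None:
            raise ValueError(f"unrecognized mutation: {part!r}")
        original, position, replacement = match.groups()
        parsed.append((original, int(position), replacement))
    return parsed


def apply_mutations(sequence, designation, first_position=1):
    """Apply the substitutions to `sequence`, whose first residue carries
    the number `first_position`; fail if an original residue mismatches."""
    residues = list(sequence)
    for original, position, replacement in parse_mutations(designation):
        index = position - first_position
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != original:
            raise ValueError(
                f"expected {original} at {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Hypothetical fragment numbered from residue 196; the flanking G and D
# are placeholders, not taken from the bay sequence.
fragment = "GMRRD"
mutated = apply_mutations(fragment, "M197R/R199H", first_position=196)
print(mutated)
```

Checking the original residue before substituting guards against numbering mix-ups between precursor and mature protein coordinates.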

B. Cuphea palustris C14 Thioesterase 

To determine possible amino acid modifications for
alteration of thioesterase substrate specificity towards
shorter chain length fatty acyl-ACPs, sequences for C14:0
preferring thioesterases may be compared to sequences for
C8:0 and C10:0 preferring thioesterases. A comparison of
amino acid sequences of thioesterase CpFatB2 (C14) to
CpFatB1 (C8/C10) is shown in Figure 10. The most striking
differences in these thioesterase sequences are found in
amino acids 230 to 312. Substitutions, such as H229I,
H241N, W253Y, E275A, R290G, F292L, L295F, and C304R, can
be made singly or in combination. Alternatively,
domain swapping clones may be prepared which provide for
switching of portions of the C8/C10 and C14 sequences. Of
particular interest in this regard are sequences IEPQFV
starting at amino acid 274, and DRKFHKL starting at amino
acid 289.
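To give a sense of how many single and combined mutants the listed substitutions imply, the sketch below enumerates them. The slash-joined naming follows the style used for the bay mutants; the function name and the `max_size` parameter are assumptions made for illustration.

```python
from itertools import combinations

# Substitutions listed in the text for CpFatB2.
substitutions = ["H229I", "H241N", "W253Y", "E275A",
                 "R290G", "F292L", "L295F", "C304R"]


def mutant_designations(subs, max_size=None):
    """Yield every non-empty combination of substitutions, joined with
    '/' (e.g. 'H229I/H241N'), up to `max_size` substitutions each."""
    limit = len(subs) if max_size is None else max_size
    for size in range(1, limit + 1):
        for combo in combinations(subs, size):
            yield "/".join(combo)


singles_and_doubles = list(mutant_designations(substitutions, max_size=2))
all_mutants = list(mutant_designations(substitutions))
print(len(singles_and_doubles), len(all_mutants))
```

Eight substitutions give 8 single and 28 double mutants, and 255 combinations in total, which is one practical argument for the domain swapping alternative.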

Example 5 Specificity of Chimeric Enzymes and Bay Mutants

E. coli cells transformed with lacZ expression constructs
are grown to an OD600 of 0.6 at 30°C, followed by addition
of 1 mM IPTG and continued growth at 30°C for 2 hours. The
sedimented cells were resuspended and sonicated in the
assay buffer, and acyl-ACP hydrolysis is measured as
previously described (Davies, H.M. (1993) Phytochemistry
33:1353-1356). Sure cells transformed with pCGN3823 and
pBC served as positive and negative controls, respectively.

Figure 11 shows the thioesterase specific activities
of the chimeric bay/camphor enzymes when E. coli cells
transformed with Ch-1 and Ch-2 were induced and assayed.
For Ch-1 (Fig. 11A) the preferred substrate is 14:0-ACP,
whereas for Ch-2 (Fig. 11B) it is 12:0-ACP. These results
indicate that the C-terminal portion of the thioesterase
protein determines the substrate specificity.

The enzyme specificities of two of the bay mutants
are shown in Fig. 11C and 11D. A mutant in which Met197
becomes an arginine and Arg199 becomes a histidine
(M197R/R199H) results in altered specificity of the bay
thioesterase such that the enzyme is equally specific
towards both 12:0-ACP and 14:0-ACP substrates (Fig. 11C).
Another mutant, T231K, gives an activity profile identical
to that of the wild type (data not shown). However, the
triple mutant M197R/R199H/T231K, which combines the three
mutations, demonstrates 14:0-ACP specific thioesterase
activity (Fig. 11D). When this triple mutant enzyme is
assayed at high concentration, very low levels of 12:0-ACP
activity are detectable.

Two more mutants (R327Q and R322M/R327Q) were also
tested for thioesterase activity. Both mutants show
identical activity profiles, and their specific activities
toward 12:0-ACP and 14:0-ACP decrease about 100- and
30-fold, respectively, compared to the wild-type bay
thioesterase. These data indicate that the mutation R327Q
is responsible for the decreased activity. Decreased
activity of R327Q is likely due to the fact that this amino
acid position is located very close to the active site
cysteine, C320. Studies which demonstrated the catalytic
activity of C320 were conducted as follows. C320 was
changed by site-directed mutagenesis to either serine or
alanine. The mutant C320A completely lost thioesterase
activity, while C320S retained approximately 60% of the
wild-type activity. Interchange of cysteine and serine in
the active site has also been demonstrated for animal
thioesterases (Witkowski et al. (1992) J. Biol. Chem.
267:18488-18492). In animals, the active site is a serine,
and the change thus was from serine to cysteine.

Example 6 Expression of Bay Mutants in E. coli fadD Cells 

The E. coli fatty acid-degradation mutant strain K27
(fadD88), a strain lacking acyl-coenzyme A synthetase, is
unable to utilize free fatty acids when they are supplied
in the medium (Klein et al. (1971) Eur. J. Biochem. 29:442-450).
Thus, it is an ideal host for observing the impact of
recombinant thioesterases on the bacterial fatty acid
synthase without interference from fatty acid degradation.
E. coli fadD was obtained from the E. coli Genetic Stock
Center, Yale University (CGSC 5478). The fadD cells were
transformed with the pBC vector, a wild-type bay
thioesterase gene, or the mutant constructs, and grown
overnight at 30°C in LB medium containing 50 mg/l
chloramphenicol and 1 mM IPTG. Total lipids were analyzed
as described previously (Voelker et al. (1994) supra).
Results of these analyses are presented in Table I below.

Table I

Free Fatty Acid Accumulation (nmole/ml culture)

Strain                  12:0       14:0

Control*                 0.3        1.6
Bay Thioesterase       505.5       39.0
M197R/R199H            123.5      181.1
M197R/R199H/T231K       35.4      352.9

*fadD cells transformed with the pBC vector only.



When bay thioesterase is expressed in fadD cells,
large amounts of laurate (more than 500-fold above control
background) and small amounts of myristate (about 10% of
that of laurate) are produced (Table I). This result is
consistent with the previous report (Voelker et al. (1994)
supra). When mutant M197R/R199H is expressed in fadD cells,
the ratio of 12:0 to 14:0 accumulation changes to 1:1.5
(Table I), reflecting the thioesterase specificity of this
mutant (Fig. 11C). When mutant M197R/R199H/T231K is
expressed in fadD cells, the ratio of 12:0 to 14:0 is
completely reversed from that seen with the wild-type bay
thioesterase. This result is also consistent with enzyme
specificity of the mutant (Fig. 11D).
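The ratios discussed above follow directly from the Table I values. The sketch below recomputes them; the dictionary layout and function name are illustrative choices, and the numbers are copied from the table.

```python
# Free fatty acid accumulation (nmole/ml culture) from Table I,
# stored as (12:0, 14:0) pairs.
table_1 = {
    "Control": (0.3, 1.6),
    "Bay Thioesterase": (505.5, 39.0),
    "M197R/R199H": (123.5, 181.1),
    "M197R/R199H/T231K": (35.4, 352.9),
}


def summarize(strain, control="Control"):
    """Return the 14:0 to 12:0 ratio and the fold increase of each
    fatty acid over the vector-only control for one strain."""
    c12, c14 = table_1[strain]
    control_c12, control_c14 = table_1[control]
    return {
        "14:0 per 12:0": c14 / c12,
        "12:0 fold over control": c12 / control_c12,
        "14:0 fold over control": c14 / control_c14,
    }


summary = {name: summarize(name) for name in table_1 if name != "Control"}
for name, values in summary.items():
    print(name, {key: round(value, 2) for key, value in values.items()})
```

For the double mutant the 14:0 to 12:0 ratio is about 1.5, and for the triple mutant it is close to 10, the reverse of the wild-type preference.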

Example 7 Kinetic Analysis 

In order to gain insight into the impact of the
mutations on the bay thioesterase, basic kinetics and
inhibition studies were performed. Progress curves of
thioesterase activity were obtained by scaling up the assay
volume and sampling 100 µl at 5 minute intervals into 0.5 ml
stop solution. Kinetic assays were performed at 30°C in
buffer containing 100 mM Tris-HCl, pH 8.0, 0.01% Triton
X-100, 1 mM DTT, 10% glycerol. After extraction of each
reaction mixture with 2.0 ml dimethyl ether, the
radioactivity in 900 µl of the organic fraction was
determined by liquid scintillation counting. This
procedure allows accurate measurement of the total
extractable free fatty acid (14C-labeled) without the
interference of interphase between the organic and aqueous
fractions. Production of laurate and myristate in this
assay was linear with respect to time for at least 30 min,
and with respect to enzyme concentrations up to 1 mU. All
assays were done in duplicate. Initial rate data were
fitted to the following equations using kinetics software
from Bio-Metallics, Inc. (Kcat): for competitive inhibition,
$v = V_{\max} S / [K_{m,app}(1 + I/K_{is}) + S]$; for
noncompetitive inhibition,
$v = V_{\max} S / [K_{m,app}(1 + I/K_{is}) + S(1 + I/K_{ii})]$;
and for uncompetitive inhibition,
$v = V_{\max} S / [K_{m,app} + S(1 + I/K_{ii})]$, where $v$ is
velocity; $V_{\max}$ is maximum velocity; $S$ is substrate
concentration; $K_{m,app}$ is apparent Michaelis constant;
$K_{is}$ and $K_{ii}$ are slope and intercept inhibition
constants, respectively; and $I$ is inhibitor concentration.
Results of these analyses are presented in Table II below.
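The three rate equations differ only in which inhibition terms are present, so they can be written as one function. This is a sketch for illustration, not the Kcat software named above; the function name, argument names and the numerical values in the usage lines are assumptions.

```python
def inhibited_rate(s, i, vmax, km_app, k_is=None, k_ii=None):
    """Initial velocity under the linear inhibition models in the text.

    Supplying only k_is (slope constant) gives competitive inhibition,
    only k_ii (intercept constant) gives uncompetitive inhibition, and
    supplying both gives noncompetitive inhibition.
    """
    slope_term = 1.0 + (i / k_is if k_is is not None else 0.0)
    intercept_term = 1.0 + (i / k_ii if k_ii is not None else 0.0)
    return vmax * s / (km_app * slope_term + s * intercept_term)


# Arbitrary illustrative values (not measurements from this work).
uninhibited = inhibited_rate(s=5.0, i=0.0, vmax=1.0, km_app=5.0, k_is=10.0)
competitive = inhibited_rate(s=5.0, i=10.0, vmax=1.0, km_app=5.0, k_is=10.0)
uncompetitive = inhibited_rate(s=5.0, i=10.0, vmax=1.0, km_app=5.0, k_ii=10.0)
noncompetitive = inhibited_rate(s=5.0, i=10.0, vmax=1.0, km_app=5.0,
                                k_is=10.0, k_ii=10.0)
print(uninhibited, competitive, uncompetitive, noncompetitive)
```

With no inhibitor the expression reduces to the Michaelis-Menten equation, which is a convenient sanity check on any fitting code built around it.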



Table II

Kinetic Constants of Wild-type Bay TE and
Triple Mutant M197R/R199H/T231K

Enzyme       Km,app (µM)                 Ki (µM)*
             14:0-ACP      12:0-ACP      12:0-ACP

Bay TE       6.4 ± 1.9     1.9 ± 0.5     10.2 ± 1.2 (competitive)**
Mutant       2.3 ± 0.4     ND            11.6 ± 0.2 (competitive)

*Slope inhibition constants of 12:0-ACP with 14:0-ACP as
the varied substrate.
**Competitive inhibition with respect to 14:0-ACP.
ND - not determined.

Under the same experimental conditions, both bay
thioesterase and the triple mutant M197R/R199H/T231K have
similar values of Km,app with respect to 14:0-ACP. The
specific activity of the mutant towards 12:0-ACP is too low
to obtain any meaningful kinetic parameters in our
assay system. Nevertheless, these results indicate that
the mutations do not significantly increase the substrate
(14:0-ACP) binding affinity to the mutant enzyme.

Inhibition assays were conducted under the conditions
described above using cold 12:0-ACP to compete with the
substrate (14C-labeled 14:0-ACP). Results of these assays
are presented in Table III below.

Table III

Inhibition of 14:0-ACP Thioesterase Activity by 12:0-ACP

Enzyme     Substrate (14:0-ACP)    Inhibitor (12:0-ACP)    Inhibition
           Concentration (µM)      Concentration (µM)      (%)

Bay TE     5                       5                       53
           5                       25                      78
Mutant     5                       5                       48
           5                       25                      76



In these inhibition assays, a very similar result is
seen with the wild-type and the mutant enzymes. When equal
amounts of inhibitor (12:0-ACP) and substrate (14:0-ACP)
are present in the assay, the 14:0-ACP TE activity is
reduced approximately 50%. If the amount of 12:0-ACP is 5
times that of 14:0-ACP, the 14:0-ACP TE activity is reduced
more than 75%. Consistent with what has been observed
before (Pollard et al., supra), a similar kinetic mechanism
is used by the wild-type bay TE, i.e. both 12:0- and
14:0-ACP have similar Km's, but Vmax is highly favorable for
12:0-ACP. These data suggest that the specificity of the
mutant enzyme is determined in the acyl hydrolysis step;
that is, both 12:0- and 14:0-ACP can bind to the mutant
enzyme with similar affinity, but 14:0-ACP is cleaved
at a much higher rate. This conclusion is further
supported by inhibition kinetics, which show that 12:0-ACP
is a competitive inhibitor with respect to 14:0-ACP (Ki
values are 10.2 ± 1.2 µM and 11.6 ± 0.2 µM for the
wild-type and mutant enzymes, respectively; Table II).

Thus, the amino acid substitutions described for the 
bay thioesterase apparently do not directly impact the 
substrate binding site, as 12:0-ACP is a good competitive 
inhibitor to 14:0-ACP in both the wild type and the mutant 
enzymes. In fact, the Michaelis constants are similar and 
independent of substrate length for bay thioesterase and 
the engineered bay enzyme, suggesting that specificity must 
be largely determined in the acyl hydrolytic step. Because 
the substrates (acyl-ACP) are relatively large molecules 
(Mr of ACP is about 9 kD), it is likely that plant
thioesterases have very relaxed binding pockets. However,
the enzymes have high selectivities with respect to fatty
acid chain length or structure (i.e. the presence or
absence of double bonds).

Furthermore, the tripeptide Met-Arg-Arg of native bay
thioesterase is not the sole determining factor for
selectivity towards 12:0-ACP, as this tripeptide is
commonly found at the same location in other medium chain
specific thioesterases. Therefore, the changes in the
engineered bay thioesterases may only slightly alter
certain secondary structures, similar to what was observed
when surface loops of Bacillus stearothermophilus lactate
dehydrogenase were modified (El Hawrani et al. (1994)
Trends in Biotech. 12:207-211). Changing the tripeptide
from M-R-R to R-R-H apparently reduced the flexibility of
the β-structure immediately following this tripeptide,
according to the predictions of chain flexibility in
proteins (Karplus et al. (1985) Naturwiss. 72:212-213).
This may lead to reduction of the flexibility of the
substrate binding pocket and active site.

Example 8 Engineering FatA Thioesterases 

Alteration of thioesterase enzyme specificity of a
mangosteen Garm FatA1 clone is provided as an example of
modification of FatA or Class I type thioesterases.
Desirable modifications with respect to FatA thioesterases
include alteration in the substrate specificity such that
activity on C18:0 fatty acyl-ACP is increased relative to
activity on C18:1 or C16:0 fatty acyl-ACP substrates.

For example, in order to increase the relative
activity on saturated fatty acids, such as C18:0, mutations
in regions of Class I thioesterases which differ from the
corresponding regions in Class II thioesterases, which act
primarily on saturated fatty acids, may be useful. The
data from bay thioesterase engineering experiments indicate
that the region from amino acids 229 to 285 (as numbered in
the top line consensus sequence on Figure 1) is important
in thioesterase substrate binding. Amino acid sequence
comparison of this region indicates that in the highly
conserved region from amino acids 250-265, several charged
amino acids are different in FatA as compared to FatB
thioesterases. In FatA thioesterases, amino acid 261 is
negatively charged with a few exceptions, whereas in FatB
clones analyzed to date, amino acid 261 is in most cases
positively charged. Also, amino acid 254 is positively
charged in all FatA thioesterases studied to date, whereas
in FatB clones analyzed to date, amino acid 254 is in all
cases an amino acid having no charge. Thus, alteration of
the amino acid charge at these positions may lead to
alteration of substrate preference.

A FatA TE mutant in amino acid 261 (Figure 1 consensus
numbering), D261K of mangosteen FatA1, is generated using
PCR site-directed mutagenesis similar to the methods
described for modification of bay thioesterase sequences.
Mutant D261K is measured for thioesterase activity as
described above (Davies, H.M. (1993) supra). Results of
these analyses (Figure 12) demonstrate that the preference
for 18:0 versus 18:1 was 35% (18:0/18:1) in mutant D261K,
as compared to 25% in the wild-type Garm FatA1. Both the
wild-type and mutant Garm FatA1 clones demonstrate very low
activity on 16:0 and no activity on medium-chain length
substrates such as C10:0 through C14:0. An additional Garm
FatA1 mutant was prepared having the D261K mutation
indicated above, as well as a mutation to change amino acid
254 from lysine to valine. This mutant, K254V/D261K,
demonstrated an increased 18:0/18:1 ratio of 40%. These
results once again support the bay evidence which
indicates that modification of this region can change the
enzyme activity and specificity. A triple mutant,
G249T/K254V/D261K, is under construction to further modify
the Garm FatA1 clone towards the FatB thioesterase
structure for evaluation of further specificity
modification.
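The charge reasoning behind these mutants can be made explicit with a small classifier. This is a sketch under a simplifying assumption: only Lys and Arg are treated as positive and Asp and Glu as negative, with histidine counted as uncharged at the assay pH of 8.0. The function names are hypothetical.

```python
# Simplified side-chain charge classes (assumption: His is treated as
# uncharged at the assay pH of 8.0).
POSITIVE = set("KR")
NEGATIVE = set("DE")


def side_chain_charge(residue):
    """Classify a one-letter amino acid code by side-chain charge."""
    if residue in POSITIVE:
        return "positive"
    if residue in NEGATIVE:
        return "negative"
    return "neutral"


def charge_change(mutation):
    """Return (charge before, charge after) for a substitution written
    in the form used in the text, e.g. 'D261K'."""
    return side_chain_charge(mutation[0]), side_chain_charge(mutation[-1])


for name in ("D261K", "K254V", "G249T"):
    before, after = charge_change(name)
    print(f"{name}: {before} -> {after}")
```

Under this classification D261K converts the FatA-type negative charge at position 261 to the FatB-type positive charge, and K254V removes the FatA-type positive charge at position 254.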

Other desirable amino acid modifications of mangosteen
Garm FatA1 clones may be selected by comparison of the 18:0
enriched Garm FatA1 thioesterase amino acid sequence to the
amino acid sequence for a FatA clone having activity
primarily on 18:1 substrates, with little or no activity on
18:0 substrates. A comparison of the amino acid sequences
of Garm FatA1 and an 18:1 preferring thioesterase clone
from Brassica campestris (rapa), Br FatA1, is provided in
Figure 13. In view of the substrate binding alterations
demonstrated for the bay thioesterase in the region
following the predicted β-sheet and turn (anchored by amino
acids G169 and G172 of the Figure 13 mangosteen and
Brassica thioesterase comparison), this region is also a
target for substrate specificity alteration of mangosteen
thioesterase clone GarmFatA1. Secondary structure analysis
and amino acid sequence comparison of the mangosteen and
Brassica rapa Class I thioesterases result in
identification of several target mutations for further
altering the substrate specificity of the mangosteen
thioesterase, GarmFatA1. Target amino acids include Y182V,
Q186E, D209S, V210D and H219F.

Furthermore, the unique restriction sites, BglII and
SpeI, at amino acids 241 and 293 of Garm FatA1 (numbering
as in Figure 4), provide for convenient domain swapping of
the mangosteen thioesterase region between amino acids 242
and 293 (Figure 4 numbering). This region contains both
the histidine 248 and cysteine 283 active site amino acids
which have been identified by mutagenesis and biochemical
assay. Thus, the major portion of the mangosteen
thioesterase active site may be removed and replaced by the
corresponding region (obtainable by PCR amplification) from
other acyl-ACP thioesterases. Such methods allow for
further modification of acyl-ACP thioesterase activity,
such as increasing specific activity of the mangosteen
thioesterase by substituting the active site of the high
specific activity bay thioesterase clone, Uc FatB1.

Example 9 Domain Swapping Techniques 

Methods for preparing thioesterase domain swapping 
constructs where convenient restriction sites are not 
available are provided. 

A method for short domain swapping is illustrated in
Figure 14. Two separate PCRs yield two fragments
(products of primers a + d, and primers b + c), which
contain overlapping sequence identical to the new domain.
Primers c and d are synthesized to match the exact sequence
at the 3' end down-stream of the original domain, plus a 5'
overhang corresponding to new domain sequence. The length
of the matching sequence should be long enough to give a Tm
of 50°C or above (calculated by assuming a C or G = 4°C and
a T or A = 2°C). Ideally, the length of the 5' overhangs
should not be greater than 18 bases (6 amino acids),
although longer overhangs may also work at lower
efficiencies. The first two PCRs are carried out with
approximately 0.2 µM of primers and 0.1 µg of template DNA
under PCR conditions described below. The second PCR run
(PCR 3) is performed by mixing 10 µl of each product of PCR
1 and 2, and adding primers a and b to a final concentration
of 0.2 µM. The resulting product is the targeted gene with
the original domain replaced by a new domain sequence. The
PCR product may be examined on an agarose gel before
precipitation and restriction-digestion for subcloning.
The modified DNA fragment should be sequenced to verify the
desired mutation.

For swapping of longer domains, as illustrated in
Figure 15, the switch of a domain from gene II to gene I
can be achieved by first amplifying three fragments from
PCR 1, 2, and 3. These partly overlapped fragments are then
mixed together for the next PCR with primers a and b. PCR
conditions are described below. The resulting full-length
product is gene I with a new domain from gene II. By the
same principle, two domains can be swapped into gene I
simultaneously by an additional PCR in the first run,
followed by the second PCR in the presence of the four
fragments (not shown).

PCR conditions which have been successfully used are
as follows: five cycles were programmed with denaturation
for 1 min at 94°C, renaturation for 30 seconds at 48°C, and
elongation for 2 min at 72°C. The first five cycles were
followed by 30 cycles using the same program except with
renaturation for 30 seconds at 60°C. The rationale for the
first five cycles at lower temperature is to ensure
annealing of the PCR primers with 5' overhangs. The
increased temperature for the later cycles limits the
further amplification to sequences amplified during the
first five cycles. The Tm's for all primers should be
designed at around 60°C. For the convenience of subsequent
cloning, the full-length anchor primers (a and b, Fig. 14
and 15) usually include additional restriction sites and/or
overhangs for various PCR subcloning vectors. It is
important to use as little template DNA as
possible (usually less than 0.1 µg) to reduce the
non-mutagenized background.
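The Tm rule of thumb given for the short domain swap (4°C per G or C, 2°C per A or T) is easy to apply when choosing the length of the matching portion of a primer. The sketch below is illustrative only; the function names are hypothetical, and the two test sequences are the T231K and R327Q primers listed in Example 4.

```python
def wallace_tm(primer):
    """Estimate Tm in degrees C as 4 per G or C plus 2 per A or T."""
    primer = primer.upper()
    gc = sum(primer.count(base) for base in "GC")
    at = sum(primer.count(base) for base in "AT")
    if gc + at != len(primer):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


def shortest_matching_length(matching_region, minimum_tm=50):
    """Return the smallest number of leading bases of `matching_region`
    whose estimated Tm reaches `minimum_tm`, or None if none does."""
    for length in range(1, len(matching_region) + 1):
        if wallace_tm(matching_region[:length]) >= minimum_tm:
            return length
    return None


t231k_tm = wallace_tm("GGTTGTCCAAAATCCC")
r327q_tm = wallace_tm("GCGTGCTGCAGTCCCTGACC")
r327q_min_length = shortest_matching_length("GCGTGCTGCAGTCCCTGACC")
print(t231k_tm, r327q_tm, r327q_min_length)
```

This simple estimate ignores salt and primer concentration, so it is best treated as a first pass before empirical optimization of the annealing temperature.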

The above results demonstrate the ability to modify
plant acyl-ACP thioesterase sequences such that engineered
thioesterases having altered substrate specificity may be
produced. Such thioesterases may be expressed in host
cells to provide a supply of the engineered thioesterase
and to modify the existing pathway of fatty acid synthesis
such that novel compositions of fatty acids are obtained.
In particular, the engineered thioesterases may be
expressed in the seeds of oilseed plants to provide a
natural source of desirable TAG molecules.

All publications and patent applications mentioned in
this specification are indicative of the level of skill of
those skilled in the art to which this invention pertains.
All publications and patent applications are herein
incorporated by reference to the same extent as if each
individual publication or patent application was
specifically and individually indicated to be incorporated
by reference.

Although the foregoing invention has been described in
some detail by way of illustration and example for purposes
of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within
the scope of the appended claims.

What is claimed is:

1. A method for obtaining an engineered plant acyl-ACP
thioesterase having an altered substrate specificity
with respect to the acyl-ACP substrates hydrolyzed by said
thioesterase, wherein said method comprises:

(a) modifying a gene sequence encoding a first plant
thioesterase protein to produce one or more modified
thioesterase gene sequences, wherein said modified
sequences encode engineered acyl-ACP thioesterases having
substitutions, insertions or deletions of one or more amino
acid residues in the mature portion of said first plant
thioesterase,

(b) expressing said modified gene sequences in a host
cell, whereby said engineered plant thioesterases are
produced, and

(c) assaying said engineered plant thioesterases to
detect altered substrate specificity.

2. A method according to Claim 1 wherein said amino
acid substitutions, insertions or deletions are in the
C-terminal two-thirds portion of said first plant
thioesterase.

3. A method according to Claim 1 wherein said amino
acid substitutions, insertions or deletions are in the
region corresponding to amino acids 230 to 285 of the
consensus numbering of thioesterase amino acid sequences
shown in Figure 1.

4. A method according to Claim 1 wherein said amino
acid substitutions, insertions or deletions are in the
region corresponding to amino acids 315 to 375 of the
consensus numbering of thioesterase amino acid sequences
shown in Figure 1.

5. A method according to Claim 1 wherein one or more
amino acid residues in the mature portion of said first
plant thioesterase are substituted with the corresponding
amino acids of a second plant thioesterase, wherein the
preferential acyl-ACP substrates for said first and second
plant thioesterases are different with respect to carbon
chain length and/or degree of saturation.

6. A method according to Claim 5 wherein said first
thioesterase is modified by substitution of a peptide
domain from said second thioesterase.

7. A method according to Claim 6 wherein said
peptide domain comprises the active histidine and active
cysteine residues of said second thioesterase protein.

8. An engineered plant acyl-ACP thioesterase,
wherein said engineered thioesterase demonstrates an
altered substrate specificity with respect to the acyl-ACP
substrates hydrolyzed by said thioesterase as compared to
wild-type acyl-ACP thioesterase in said plant.

9. An engineered thioesterase of Claim 8, wherein
said wild-type thioesterase is a Class II thioesterase.

10. An engineered thioesterase of Claim 8, wherein
said wild-type thioesterase is a Class I thioesterase.

11. A DNA sequence encoding an engineered plant acyl-ACP
thioesterase, wherein said engineered thioesterase
demonstrates an altered substrate specificity with respect
to the acyl-ACP substrates hydrolyzed by said thioesterase
as compared to the wild-type plant acyl-ACP thioesterase.

12. A DNA sequence of Claim 11, wherein said wild-type
thioesterase is a Class II thioesterase.

13. A DNA sequence of Claim 11, wherein
said wild-type thioesterase is a Class I thioesterase.